{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Otto商品分类——LightGBM\n",
    "原始特征+tfidf特征"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们以Kaggle 2015年举办的Otto Group Product Classification Challenge竞赛数据为例，对LightGBM的超参数进行调优。\n",
    "\n",
    "Otto数据集是著名电商Otto提供的一个多类商品分类问题，类别数=9. 每个样本有93维数值型特征（整数，表示某种事件发生的次数，已经进行过脱敏处理）。 竞赛官网：https://www.kaggle.com/c/otto-group-product-classification-challenge/data\n",
    "\n",
    "\n",
    "第一名：https://www.kaggle.com/c/otto-group-product-classification-challenge/discussion/14335\n",
    "第二名：http://blog.kaggle.com/2015/06/09/otto-product-classification-winners-interview-2nd-place-alexander-guschin/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# 首先 import 必要的模块\n",
    "import pandas as pd \n",
    "import numpy as np\n",
    "\n",
    "import lightgbm as lgbm\n",
    "from lightgbm.sklearn import LGBMClassifier\n",
    "\n",
    "from sklearn.model_selection import GridSearchCV\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 读取数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# 读取数据\n",
    "# 这里使用原始特征+tf_idf特征，log(x+1)特征为原始特的单调变换，加上log特征对决策树模型影响不大\n",
    "# path to where the data lies\n",
    "dpath = './data/'\n",
    "\n",
    "train1 = pd.read_csv(dpath +\"Otto_FE_train_org.csv\")\n",
    "train2 = pd.read_csv(dpath +\"Otto_FE_train_tfidf.csv\")\n",
    "\n",
    "#去掉多余的id\n",
    "train2 = train2.drop([\"id\",\"target\"], axis=1)\n",
    "train =  pd.concat([train1, train2], axis = 1, ignore_index=False)\n",
    "\n",
    "train.head()\n",
    "\n",
    "del train1\n",
    "del train2\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "pycharm": {
     "is_executing": false,
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 61878 entries, 0 to 61877\n",
      "Columns: 188 entries, id to feat_93_tfidf\n",
      "dtypes: float64(186), int64(1), object(1)\n",
      "memory usage: 88.8+ MB\n"
     ]
    }
   ],
   "source": [
    "train.info()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 准备数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# 将类别字符串变成数字，LightGBM不支持字符串格式的特征输入/标签输入\n",
    "y_train = train['target'] #形式为Class_x\n",
    "y_train = y_train.map(lambda s: s[6:])\n",
    "y_train = y_train.map(lambda s: int(s) - 1)#将类别的形式由Class_x变为0-8之间的整数\n",
    "\n",
    "X_train = train.drop([\"id\", \"target\"], axis=1)\n",
    "\n",
    "#保存特征名字以备后用（可视化）\n",
    "feat_names = X_train.columns \n",
    "\n",
    "#sklearn的学习器大多之一稀疏数据输入，模型训练会快很多\n",
    "#查看一个学习器是否支持稀疏数据，可以看fit函数是否支持: X: {array-like, sparse matrix}.\n",
    "#可自行用timeit比较稠密数据和稀疏数据的训练时间\n",
    "from scipy.sparse import csr_matrix\n",
    "X_train = csr_matrix(X_train)"
   ]
  },
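  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check of the label conversion above, here is a minimal sketch in plain Python. The helper name encode_label and the toy label list are assumptions made for illustration; they are not part of the notebook's pipeline.\n",
    "\n",
    "```python\n",
    "def encode_label(label, prefix='Class_'):\n",
    "    # Turn a label such as 'Class_3' into the zero-based integer 2\n",
    "    if not label.startswith(prefix):\n",
    "        raise ValueError('unexpected label: ' + label)\n",
    "    return int(label[len(prefix):]) - 1\n",
    "\n",
    "\n",
    "toy_labels = ['Class_1', 'Class_9', 'Class_4']  # hypothetical examples\n",
    "encoded = [encode_label(s) for s in toy_labels]\n",
    "print(encoded)\n",
    "```\n",
    "\n",
    "Deriving the slice from the prefix length, rather than hard-coding the index 6, keeps the helper correct if the prefix ever changes."
   ]
  },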
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LightGBM超参数调优"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "LightGBM的主要的超参包括：\n",
    "1. 树的数目n_estimators 和 学习率 learning_rate\n",
    "2. 树的最大深度max_depth 和 树的最大叶子节点数目num_leaves（注意：XGBoost只有max_depth，LightGBM采用叶子优先的方式生成树，num_leaves很重要，设置成比 2^max_depth 小）\n",
    "3. 叶子结点的最小样本数:min_data_in_leaf(min_data, min_child_samples)\n",
    "4. 每棵树的列采样比例：feature_fraction/colsample_bytree\n",
    "5. 每棵树的行采样比例：bagging_fraction （需同时设置bagging_freq=1）/subsample\n",
    "6. 正则化参数lambda_l1(reg_alpha), lambda_l2(reg_lambda)\n",
    "\n",
    "7. 两个非模型复杂度参数，但会影响模型速度和精度。可根据特征取值范围和样本数目修改这两个参数\n",
    "1）特征的最大bin数目max_bin：默认255；\n",
    "2）用来建立直方图的样本数目subsample_for_bin：默认200000。\n",
    "\n",
    "对n_estimators，用LightGBM内嵌的cv函数调优，因为同XGBoost一样，LightGBM学习的过程内嵌了cv，速度极快。\n",
    "其他参数用GridSearchCV"
   ]
  },
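  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The rule of thumb that num_leaves should stay below 2^max_depth can be checked with a few lines of plain Python. This is only an illustrative sketch: the function name leaves_fit_depth is an assumption, and the (num_leaves, max_depth) pairs are taken from, or chosen to contrast with, the settings used in this notebook.\n",
    "\n",
    "```python\n",
    "def leaves_fit_depth(num_leaves, max_depth):\n",
    "    # A binary tree of depth max_depth has at most 2 ** max_depth leaves\n",
    "    return num_leaves < 2 ** max_depth\n",
    "\n",
    "\n",
    "# (num_leaves, max_depth) pairs to check\n",
    "settings = [(60, 6), (80, 7), (80, 6)]\n",
    "results = {pair: leaves_fit_depth(*pair) for pair in settings}\n",
    "for (leaves, depth), ok in results.items():\n",
    "    print(leaves, depth, 2 ** depth, ok)\n",
    "```\n",
    "\n",
    "A pair that fails the check, such as num_leaves=80 with max_depth=6, asks for more leaves than a tree of that depth can hold, so the depth limit rather than num_leaves ends up bounding the tree."
   ]
  },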
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "MAX_ROUNDS = 10000"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 相同的交叉验证分组"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# prepare cross validation\n",
    "from sklearn.model_selection import StratifiedKFold\n",
    "\n",
    "kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. n_estimators"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "#直接调用lightgbm内嵌的交叉验证(cv)，可对连续的n_estimators参数进行快速交叉验证\n",
    "#而GridSearchCV只能对有限个参数进行交叉验证，且速度相对较慢\n",
    "def get_n_estimators(params , X_train , y_train , early_stopping_rounds=10):\n",
    "    lgbm_params = params.copy()\n",
    "    lgbm_params['num_class'] = 9\n",
    "     \n",
    "    lgbmtrain = lgbm.Dataset(X_train , y_train )\n",
    "     \n",
    "    #num_boost_round为弱分类器数目，下面的代码参数里因为已经设置了early_stopping_rounds\n",
    "    #即性能未提升的次数超过过早停止设置的数值，则停止训练\n",
    "    cv_result = lgbm.cv(lgbm_params , lgbmtrain , num_boost_round=MAX_ROUNDS , nfold=3,  metrics='multi_logloss' , early_stopping_rounds=early_stopping_rounds,seed=3 )\n",
    "     \n",
    "    print('best n_estimators:' , len(cv_result['multi_logloss-mean']))\n",
    "    print('best cv score:' , cv_result['multi_logloss-mean'][-1])\n",
    "     \n",
    "    return len(cv_result['multi_logloss-mean'])"
   ]
  },
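  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The early-stopping rule that get_n_estimators relies on can be illustrated with a simplified sketch in plain Python. This is not LightGBM's implementation: the function name early_stopping_round, the patience argument and the loss values are all made up for illustration.\n",
    "\n",
    "```python\n",
    "def early_stopping_round(losses, patience):\n",
    "    # Return (best_round, best_loss), stopping once the loss has not\n",
    "    # improved for `patience` consecutive rounds. Rounds are 1-based.\n",
    "    best_round, best_loss = 0, float('inf')\n",
    "    for i, loss in enumerate(losses, start=1):\n",
    "        if loss < best_loss:\n",
    "            best_round, best_loss = i, loss\n",
    "        elif i - best_round >= patience:\n",
    "            break  # no improvement for `patience` rounds\n",
    "    return best_round, best_loss\n",
    "\n",
    "\n",
    "# Hypothetical per-round CV losses; the last value is never reached\n",
    "cv_losses = [0.90, 0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.50]\n",
    "best_round, best_loss = early_stopping_round(cv_losses, patience=3)\n",
    "print(best_round, best_loss)\n",
    "```\n",
    "\n",
    "The sketch also shows the trade-off behind early_stopping_rounds: a small patience saves time but can stop before a later improvement, which is why the value is worth revisiting if the learning rate is lowered."
   ]
  },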
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "best n_estimators: 394\n",
      "best cv score: 0.4767505203274461\n"
     ]
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'num_leaves': 60,\n",
    "          'max_depth': 6,\n",
    "          'max_bin': 127, #2^6,原始特征为整数，很少超过100\n",
    "          'subsample': 0.7,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.7,\n",
    "         }\n",
    "\n",
    "n_estimators_1 = get_n_estimators(params , X_train , y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. num_leaves & max_depth=7\n",
    "num_leaves建议70-80，搜索区间50-80,值越大模型越复杂，越容易过拟合\n",
    "相应的扩大max_depth=7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 4 candidates, totalling 12 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "[Parallel(n_jobs=4)]: Done   8 out of  12 | elapsed: 17.0min remaining:  8.5min\n",
      "[Parallel(n_jobs=4)]: Done  12 out of  12 | elapsed: 24.0min finished\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "GridSearchCV(cv=StratifiedKFold(n_splits=3, random_state=3, shuffle=True),\n",
       "             error_score='raise-deprecating',\n",
       "             estimator=LGBMClassifier(bagging_freq=1, boosting_type='gbdt',\n",
       "                                      class_weight=None, colsample_bytree=0.7,\n",
       "                                      importance_type='split',\n",
       "                                      learning_rate=0.1, max_bin=127,\n",
       "                                      max_depth=7, min_child_samples=20,\n",
       "                                      min_child_weight=0.001,\n",
       "                                      min_split_gain=0.0, n_estimators=394,\n",
       "                                      n_jobs=4, num_class=9, num_leaves=31,\n",
       "                                      objective='multiclass', random_state=None,\n",
       "                                      reg_alpha=0.0, reg_lambda=0.0,\n",
       "                                      silent=False, subsample=0.7,\n",
       "                                      subsample_for_bin=200000,\n",
       "                                      subsample_freq=0),\n",
       "             iid='warn', n_jobs=4, param_grid={'num_leaves': range(50, 90, 10)},\n",
       "             pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',\n",
       "             scoring='neg_log_loss', verbose=5)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'max_bin': 127, #2^6,原始特征为整数，很少超过100\n",
    "          'subsample': 0.7,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.7,\n",
    "         }\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "\n",
    "num_leaves_s = range(50,90,10) #50,60,70,80\n",
    "tuned_parameters = dict( num_leaves = num_leaves_s)\n",
    "\n",
    "grid_search = GridSearchCV(lg, n_jobs=4, param_grid=tuned_parameters, cv = kfold, scoring=\"neg_log_loss\", verbose=5, refit = False,return_train_score='warn'\n",
    ")\n",
    "grid_search.fit(X_train , y_train)\n",
    "#grid_search.best_estimator_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.47722504269785376\n",
      "{'num_leaves': 80}\n"
     ]
    }
   ],
   "source": [
    "# examine the best model\n",
    "print(-grid_search.best_score_)\n",
    "print(grid_search.best_params_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZUAAAEHCAYAAABm9dtzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3deXxV9Z3/8dfn5pKwBQgkoMiOiCyyRgiKto523MGxrIK7QuuCTm2n7cyvndbaqaNdFNTKoqIUBbRacWnVsS6AgIRNWUSQNYAssq8hyef3xz3BiAEC3JuTm7yfj0ceyVnuuZ+vx+TN93vO/R5zd0REROIhEnYBIiJSeShUREQkbhQqIiISNwoVERGJG4WKiIjETTTsAsKUmZnpLVq0CLsMEZGkMnfu3K3unlXatiodKi1atCA3NzfsMkREkoqZrTnaNg1/iYhI3ChUREQkbhQqIiISNwoVERGJG4WKiIjEjUJFRETiRqEiIiJxo1A5CV/uPMAzM1ahxwaIiHyTQuUkTJqzll+/toTbn5vLzn2Hwi5HRKTCUKichHsubsMvrmrP+8s2c8XIaSxYtyPskkREKgSFykkwM27t3ZIXf9ALgP5PfqThMBERFCqnpGuzDN4Y0ZsL22Tx69eWcMfEeew6oOEwEam6FCqnqF7NVMbekM3PLz+bt5ds4qqR01m0fmfYZYmIhEKhEgeRiDH8O62ZPCyH/IIirn3iIybMWqPhMBGpchQqcZTdoj5vjOhNTusG/OJvi7j7hfnsOVgQdlkiIuVGoRJnDWqnMf6mc/nJpW1589ON9Bk1naUbd4VdlohIuVCoJEAkYtx50ZlMvC2H3QcLuObxGUz6eK2Gw0Sk0lOoJFCv1g14c8QFZLfI4Gcvf8p9UxayL1/DYSJSeSlUEiwrPY3nbunJvZe04ZUF6+nz2Aw+37Q77LJERBJCoVIOUiLGvZecxV9u7cmOffn0fWwGL83NC7ssEZG4U6iUo/PPzOTNERfQqUldfvziQv7jpYXszy8MuywRkbhRqJSzhnWqM/G2ntx10ZlMyc3jmsdn8MWWPWGXJSISFwqVEERTIvz40raMv/lcNu8+wNWjpvPqgvVhlyUicsoUKiH6btuGvHnPBbQ/vQ73TFrAf77yKQcOaThMRJKXQiVkp9etwQvDchj+nVY8P3st1z7xEau37g27LBGRk6JQqQCqpUT4+eXteOrGbNbv2M9Vo6bz5qcbwy5LROSEKVQqkIvbNeKNEb05s2Ft7pg4j/9+dREHCzQcJiLJQ6FSwTTJqMmU4b245fyWPDtzDf2fnMm6bfvCLktEpEwUKhVQajTCL69uz5NDu7Nq616uHDmNtxd/GXZZIiLHpVCpwC7reBpv3H0BzRvUYtiEuTzw+hIOFRaFXZaIyFEpVCq4Zg1q8tIPe3FDr+aMm76KAaNnsn7H/rDLEhEplUIlCaRFU7i/b0ceu64ryzft4cqR0/jnZ5vCLktE5FsUKknkqk6Nee3u3pxetwa3jM/lwb9/RoGGw0SkAlGoJJmWmbV45Y7zGNyjGU9+8AWDx87iy50Hwi5LRARQqCSl6tVS+N215/DIwC4s3rCLK0ZO48PPt4RdloiIQiWZXdP1DKbe1ZvM2qnc+MzH/OHtZRQW6ZHFIhIehUqSO7NhbV69szf9ujVh1D9XMHTcbDbv1nCYiIQjoaFiZpeZ2TIzW2FmPzvGfv3MzM0sO1geYmYLSnwVmVmXYNtgM/vUzD4xs3+YWWawvr6ZvWNmy4PvGYlsW0VSIzWFh/t35uF+nZi/bjtXPDqdj1ZsDbssEamCEhYqZpYCPA5cDrQHBptZ+1L2SwdGALOL17n7RHfv4u5dgOuB1e6+wMyiwKPARe7eCfgEuCt42c+Ad929DfBusFyl9M9uyqt39qZujShDn5rNyHeXazhMRMpVInsqPYAV7r7S3fOBSUDfUvb7DfAQcLQx
m8HAC8HPFnzVMjMD6gAbgm19gWeDn58FrjnlFiShtqelM/Wu3vTp3Jg/vvM5Nz3zMVv3HAy7LBGpIhIZKmcA60os5wXrDjOzrkBTd3/9GMcZSBAq7n4I+CHwKbEwaQ88FezXyN03BvttBBrGoQ1JqVZalD8N7MLvrj2H2au2ceXIacxe+VXYZYlIFZDIULFS1h0eizGzCPAn4L6jHsCsJ7DP3RcFy9WIhUpXoDGx4a+fn1BRZsPMLNfMcrdsqby34ZoZg3s045U7zqNGtRSuGzebJ95fQZGGw0QkgRIZKnlA0xLLTfh6qAogHegIvG9mq4EcYGrxxfrAIL4e+gLoAuDuX7i7A1OA84Jtm8zsdIDg++bSinL3Me6e7e7ZWVlZJ9u2pNGhcV1eu7s3l3U8jYf+sYxbn53D9r35YZclIpVUIkNlDtDGzFqaWSqxgJhavNHdd7p7pru3cPcWwCygj7vnwuGeTH9i12KKrQfam1lxGnwPWBr8PBW4Mfj5RuDVxDQr+aRXr8Zjg7vym74dmLHiK64YOY25a7aFXZaIVEIJCxV3LyB2Z9ZbxP7wT3H3xWZ2v5n1KcMhLgTy3H1liWNuAH4NfGhmnxDrufxPsPlB4HtmtpxY2DwYv9YkPzPj+l4t+OsPzyOaYgwcPYuxH64k1uETEYkPq8p/VLKzsz03NzfsMsrdzv2H+I+XFvLW4k1c0q4Rf+jfmbo1q4VdlogkCTOb6+7ZpW3TJ+qroLo1qvHk0O784qr2vL9sM1eOmsbCdTvCLktEKgGFShVlZtzauyUv/qAX7tDvyY94ZsYqDYeJyClRqFRxXZtl8MaI3lzYJotfv7aEOybOY9eBQ2GXJSJJSqEi1KuZytgbsvn55Wfz9pJNXD1qOovW7wy7LBFJQgoVASASMYZ/pzWTh+Vw8FAR1z7xEX+ZtUbDYSJyQhQq8g3ZLerzxoje5LRuwP/72yJGTFrAnoMFYZclIklCoSLf0qB2GuNvOpefXNqWNz7ZQJ9R01m6cVfYZYlIElCoSKkiEePOi85k4m057D5YwDWPz2DynLUaDhORY1KoyDH1at2AN0dcQHaLDH7610+5b8pC9uVrOExESqdQkePKSk/juVt6cu8lbXhlwXr6PjaD5Zt2h12WiFRAChUpk5SIce8lZzHhlp5s35dPn8dm8Ne5eWGXJSIVjEJFTkjvNpm8MeICOjWpy30vLuSnL33CgUOFYZclIhWEQkVOWKM61Zl4W0/uvKg1k3PXcc3jM/hiy56wyxKRCkChIiclmhLhJ5eezfibz2XTrgP0GTWdVxesD7ssEQmZQkVOyXfbNuTNey6g3el1uGfSAv7rlU81HCZShSlU5JSdXrcGLwzLYfh3WjFx9lq+/+ePWL11b9hliUgIFCoSF9VSIvz88nY8dWM2edv3c/Wo6fz9041hlyUi5UyhInF1cbtGvDGiN60b1uaHE+fxq6mLOVig4TCRqkKhInHXJKMmU4b34pbzWzL+o9X0f3Im67btC7ssESkHChVJiNRohF9e3Z4nh3Zn1da9XDlyGm8v/jLsskQkwRQqklCXdTyNN+6+gOYNajFswlweeH0JhwqLwi5LRBJEoSIJ16xBTV76YS9u6NWccdNXMWD0TNbv2B92WSKSAAoVKRdp0RTu79uRx67ryvJNe7hy5DT++dmmsMsSkThTqEi5uqpTY167uzen163BLeNzefDvn1Gg4TCRSkOhIuWuZWYtXrnjPAb3aMaTH3zB4LGz+HLngbDLEpE4UKhIKKpXS+F3157DIwO7sHjDLq4YOY0PP98SdlkicooUKhKqa7qewdS7epNZO5Ubn/mYP7y9jMIiPbJYJFkpVCR0Zzaszat39qZftyaM+ucKho6bzebdGg4TSUYKFakQaqSm8HD/zjzcrxPz123niken89GKrWGXJSInSKEiFUr/7Ka8emdv6taIMvSp2Yx8d7mGw0SSiEJFKpy2p6Uz9a7e9OncmD++8zk3PfMx
W/ccDLssESkDhYpUSLXSovxpYBd+d+05zF61jStHTmP2yq/CLktEjkOhIhWWmTG4RzNeueM8alRL4bpxs3ni/RUUaThMpMJSqEiF16FxXV67uzeXdTyNh/6xjFufncP2vflhlyUipUhoqJjZZWa2zMxWmNnPjrFfPzNzM8sOloeY2YISX0Vm1sXM0o9Yv9XMHgle08zM3jOz+Wb2iZldkci2SflKr16NxwZ35Td9OzBjxVdcMXIac9dsC7ssETlCwkLFzFKAx4HLgfbAYDNrX8p+6cAIYHbxOnef6O5d3L0LcD2w2t0XuPvu4vXBtjXAy8HL/h8wxd27AoOAJxLVNgmHmXF9rxb89YfnEU0xBo6exdgPV+Ku4TCRiiKRPZUewAp3X+nu+cAkoG8p+/0GeAg42qfdBgMvHLnSzNoADYFpwSoH6gQ/1wU2nHzpUpGd06Qur999ARe3a8hv31zK7c/NZee+Q2GXJSIkNlTOANaVWM4L1h1mZl2Bpu7++jGOM5BSQoVY2Ez2r/+Z+itgqJnlAW8Cd5d2MDMbZma5Zpa7ZYvmmkpWdWtU48mh3fnFVe15f9lmrhw1jYXrdoRdlkiVl8hQsVLWHR6nMLMI8CfgvqMewKwnsM/dF5WyeRDfDJvBwHh3bwJcAUwI3uObBbiPcfdsd8/OysoqW0ukQjIzbu3dkhd/0At36PfkR9z1/Dyenr6KBet2kF+gKfVFyls0gcfOA5qWWG7CN4ek0oGOwPtmBnAaMNXM+rh7brDPkcEBgJl1BqLuPrfE6luBywDcfaaZVQcygc3xaY5UVF2bZfDGiN787s3PmLZ8C69/shGAtGiEc86oS7fmGXRrVo9uzTJoWKd6yNWKVG6JDJU5QBszawmsJxYQ1xVvdPedxP7oA2Bm7wM/Lg6UoJfRH7iwlGOXdp1lLXAxMN7M2gHVAY1vVRH1aqbyv/06AbBx537mrdnBvLXbmbd2O+NnrGbMh7FeS5OMGnRrFoRM8wzanV6Haim6s14kXo4bKmbWGshz94Nm9l2gE/Ccux9zANvdC8zsLuAtIAV42t0Xm9n9QK67Tz3OW18YvO/KUrYNIDbEVdJ9wFgz+3diw2w3uW4LqpJOr1uDKzvV4MpOpwNwsKCQRet3MT8ImY9XbWPqwlinuXq1CJ3OqEfX5vXo3iyDbs0zyKydFmb5IknNjvd318wWANlAC2IBMRVo6+5J/zmQ7Oxsz83NPf6OUuls2LGfuWu2B72ZHSzZsJNDhbHfhWb1ax7uyXRrlsHZp6UTVW9G5DAzm+vu2aVtK8vwV1HQ6/g34BF3H2Vm8+Nbokj5alyvBo3r1eDqzo0BOHCokEXrd8ZCZs0OPvriK/62INabqVEthU5NYtdmujfLoGuzejRQb0akVGUJlUNmNhi4Ebg6WFctcSWJlL/q1VLIblGf7Bb1AXB31ge9mflrY9dnxn64kj8H8461aFCTbs0y6BrcBNC2kXozIlC2ULkZ+AHwW3dfFVx4/0tiyxIJl5nRJKMmTTJq0rdL7ONV+/ML+fRwb2Y7Hy7fysvz1wNQMzWFzk3q0b15Bt2a16Nr0wwyaqWG2QSRUBz3mso3djbLIPZhxU8SV1L50TUVORXuzrpt+w/fZTZv7XaWbtx9+KFirTJr0bVZLGS6NcvgrEbppERK+/iWSHI51jWVslyofx/oQ6xXs4DYbbofuPuP4lxnuVOoSLztyy/gk7yvr83MW7udbcGMyrXTonRuWjd2XaZ5Bt2aZlC3pkaSJfmc6oX6uu6+y8xuA55x9/82s0rRUxGJt5qpUXJaNSCnVQMg1ptZ89W+r3sza3bw2HsrKH4kTOusWrHPzQR3mrVpWJuIejOSxMoSKlEzO53YZ0P+K8H1iFQqZkaLzFq0yKzFtd2aALD3YAEL83bEbgBYs53/W7qJF+fmAZCeFqVL8On/bs0z6NK0HnVrqDcjyaMsoXI/sc+nzHD3OWbWClie2LJEKq9aaVHOa53Jea1j
E0q4O6u27mVecJfZvDXbGfXP5RQ5mMGZWbWDkImFTess9Wak4jqhC/WVja6pSEW152ABC9fFejLFH9DcuT82vX+d6tHYDQBB0HRpWo/06urNSPk5pWsqZtYEGAWcT2z6k+nAPe6eF9cqReSw2mlRzj8zk/PPjPVmioqclVv3Mm/t9th0M2t28Mi7n+NBb+ashumxW5mDsGmdVYtgolaRclWWu7/eAZ4HJgSrhgJD3P17Ca4t4dRTkWS268ChoDezg7lB2Ow+UABAvZrV6Nr062sznZvWo3ZaIuePlarkVG8pXhA8uveY65KRQkUqk6Ii54ste75xO/PyzXsAiBic1Sj98F1m3ZrVo2WmejNyck71luKtZjaUr6eaHwx8Fa/iRCQ+IhGjTaN02jRKZ+C5zQDYuf8QC0pcm3ltwQaen70WgPq1UmO9meax+cw6N6lHLfVm5BSV5f+gW4DHiD2l0YGPiE3dIiIVXN0a1fjOWVl856zYU04Li5wVm/ccvsts3trtvPtZ7Dl2EYOzT6tDt+bBdDPNMmhWv6Z6M3JCTuruLzO7190fSUA95UrDXyKwY1/+4Ukz563dzoK1O9ibXwhAg1qp35hqpnOTetRITQm5YgnbKV1TOcoB17p7s1OuLGQKFZFvKyxyPt+0+/C1mflrt7Ny614AUiJGu9PT6dYs43BvpklGDfVmqphEhMo6d296/D0rNoWKSNls25t/+MmZ89bsYGHeDvYFvZnM2mnfeKhZpyZ1qV5NvZnK7FQv1Jem6n5iUqQKql8rlYvbNeLido0AKCgsYtmm3cxbu4P5wbWZt5dsAiAaMdo3rnP4duZerRqQla6HmlUVR+2pmNluSg8PA2q4e9LfJqKeikj8bN1z8OtrM2u280neTvYfKqRWagpP33QuPYNJNiX5xX34q7JQqIgkTkFhEYs37OK+FxeSt30f4244l95tMsMuS+LgWKGi55+KSEJEUyJ0blqPScNyaNGgFrc8O4f3gtuXpfJSqIhIQmXWTuOF23M4q1Fthk3I5e3FX4ZdkiSQQkVEEi6jVioTb8uhQ+O63DFxHq9/siHskiRBFCoiUi7q1qjGX27rSbdmGYx4YT6vzNdE55XRcUPFzHab2a4jvtaZ2SvBA7tERMqkdlqU8becS06rBvxoykKmzFkXdkkSZ2W5LfiPwAZi098bMAg4DVgGPA18N1HFiUjlUzM1ytM3ncvwCXP5j79+wsHCIq7PaR52WRInZRn+uszdR7v7bnff5e5jgCvcfTKQkeD6RKQSql4thTE3dOeSdg35xd8WMW7ayrBLkjgpS6gUmdkAM4sEXwNKbKu6H3IRkVOSFk3hiSHdueKc03jgjaU88f6KsEuSOCjL8NcQ4FHgiWB5JjDUzGoAdyWqMBGp/FKjEUYO6kq1lIU89I9lHDxUxL2XtNEElUnsuKHi7iuBq4+yeXp8yxGRqiaaEuGPA7qQmhLh0XeXk19YxH9c2lbBkqSOGypm1gQYBZxPbLhrOnCPu+t+QBGJi5SI8b/f70RqNMKf3/+Cg4eK+MVV7RQsSagsw1/PELvzq3+wPDRY971EFSUiVU8kYjxwTUdSoxGenrGK/MJC7u/TkUhEwZJMyhIqWe7+TInl8WZ2b6IKEpGqy8z45VXtSYum8OQHX5BfUMTvru1EioIlaZQlVLaa2VDghWB5MPBV4koSkarMzPjpZW1JiwbXWAqK+H3/zkRTNAFIMijLWboFGAB8CWwE+gE3l+XgZnaZmS0zsxVm9rNj7NfPzNzMsoPlIWa2oMRXkZl1MbP0I9ZvNbNHShxngJktMbPFZvZ8WWoUkYrHzPj3753FTy5ty98WbOCeSQs4VFgUdllSBmW5+2st0KfkumD465HSX3F4nxTgcWLXXvKAOWY21d2XHLFfOjACmF3iPScCE4Pt5wCvuvuCYHOXEq+dC7wc/NwG+DlwvrtvN7OGx2ubiFRsd150JmnRCA+8sZT8wiIeu64raVE9qrgi
O9n+5I/KsE8PYIW7r3T3fGAS0LeU/X4DPAQcOMpxBvP10NthQYg0BKYFq24HHnf37QDurgc3iFQCt13Qivv7duCdJZsYPmEuBw4Vhl2SHMPJhkpZrpqdAZScLS4vWPf1Qcy6Ak3d/fVjHGcgpYQKsbCZ7F8/uvIs4Cwzm2Fms8zsslILNxtmZrlmlrtly5YyNENEwnZDrxY8eO05fPD5Fm59dg778gvCLkmO4mRDpSzTs5QWPIdfZ2YR4E/AfUc9gFlPYJ+7Lypl8yC+GTZRoA2xCS4HA+PMrN63CnAf4+7Z7p6dlZVVhmaISEUwqEcz/tC/MzO/+Iqbnp7DnoMKloroqKFylCnvd5nZbqBxGY6dBzQtsdyE2GzHxdKBjsD7ZrYayAGmFl+sDxwZHMW1dQai7j73iPd71d0PufsqYrMotylDnSKSJK7t1oSRg7syd+12rn9qNjv3Hwq7JDnCUUPF3dPdvU4pX+nuXpZbkecAbcyspZmlEguIqSWOv9PdM929hbu3AGYBfdw9Fw73ZPoTuxZzpNKus/wNuCh4bSax4TBNfSpSyVzVqTFPDOnGovU7GTJuFtv35oddkpSQsBu/3b2A2ISTbwFLgSnuvtjM7jezPsd+NQAXAnnB3GNHGsC3Q+Ut4CszWwK8B/zE3fV5GpFK6NIOpzHm+mw+37SHwWNnsXXPwbBLkoB9fZ276snOzvbc3NywyxCRkzRt+RZufy6XJhk1ef62njSsUz3skqoEM5vr7tmlbdNHVEUkaV3QJovxN/dgw479DBwzi40794ddUpWnUBGRpJbTqgETbu3B1t0HGTB6Juu27Qu7pCpNoSIiSa978/pMvL0nu/YXMHD0TFZv3Rt2SVWWQkVEKoVOTerx/O09OVBQxIDRM1mxeU/YJVVJChURqTQ6NK7LpGE5FDkMGjOTz77cFXZJVY5CRUQqlbMapTNleA7RSIRBY2axaP3OsEuqUhQqIlLptMqqzeThOdRKjXLd2FnMX7s97JKqDIWKiFRKzRvUYvLwHOrVTOX6pz5mzuptYZdUJShURKTSapJRkynDe9EwPY0bnvqYj1ZsDbukSk+hIiKV2ml1qzNpeA5N69fg5vFz+OBzPfIikRQqIlLpNUyvzqRhvWidVZvbn83l3aWbwi6p0lKoiEiVUL9WKs/f3pN2p6czfMJc/v7pxrBLqpQUKiJSZdSrmcqE23rSuWk97nphPq8uWB92SZWOQkVEqpQ61avx3C09OLdFBvdOXsCLueuO/yIpM4WKiFQ5tdKiPHNTD3qfmclPXvqE52evDbukSkOhIiJVUo3UFMbekM1FbbP4z1c+5ZkZq8IuqVJQqIhIlVW9Wgqjr8/m0g6N+PVrSxj9wRdhl5T0FCoiUqWlRiM8dl03rup0Or/7+2eMfHd52CUltWjYBYiIhK1aSoRHB3UlNRrhj+98Tn5BEff961mYWdilJR2FiogIkBIxft+vM6kpER57bwX5hUX8/PKzFSwnSKEiIhKIRIz/+bdzSItGGPPhSg4eKuS/r+5AJKJgKSuFiohICZGI8as+HUiNRhg7bRX5hUX89ppzFCxlpFARETmCmfGfV7QjLZrCY++t4GBBEQ/360yKguW4FCoiIqUwM358advDF+8PFTp/HNCZaim6afZYFCoiIscw4uI2pEYjPPj3z8gvKGTU4G6kRhUsR6P/MiIix/GD77Tmv69uz1uLN/GDv8zlwKHCsEuqsBQqIiJlcPP5Lfntv3Xkn59t5vbnctmfr2ApjUJFRKSMhvRszsP9OjF9xVZuHv8xew8WhF1ShaNQERE5Af2zm/LIwC7MWb2dG57+mF0HDoVdUoWiUBEROUF9u5zBY4O7snDdDq4fN5ud+xQsxRQqIiIn4fJzTufJod1ZunE3g8fOYtve/LBLqhAUKiIiJ+mS9o0Yc0N3vtiyh0FjZrJl98GwSwqdQkVE5BR8t21DnrnpXNZt28/AMTP5cueBsEsKVUJDxcwuM7Nl
ZrbCzH52jP36mZmbWXawPMTMFpT4KjKzLmaWfsT6rWb2yLGOJSKSaOedmclzt/Zg866DDBg9k7zt+8IuKTQJCxUzSwEeBy4H2gODzax9KfulAyOA2cXr3H2iu3dx9y7A9cBqd1/g7ruL1wfb1gAvH+tYIiLl4dwW9Zlwaw+278tn4OhZrPlqb9glhSKRPZUewAp3X+nu+cAkoG8p+/0GeAg4Wp9xMPDCkSvNrA3QEJh2AscSEUmYrs0yeOH2HPbmFzBw9Cy+2LIn7JLKXSJD5QxgXYnlvGDdYWbWFWjq7q8f4zgDKSVUiIXNZHf3EziWiEhCdTyjLpOG5XCosIiBo2fx+abdYZdUrhIZKqXNEe2HN5pFgD8B9x31AGY9gX3uvqiUzYMIwqYsxypxzGFmlmtmuVu2bDne7iIiJ+zs0+oweXgOEYNBY2axZMOusEsqN4kMlTygaYnlJsCGEsvpQEfgfTNbDeQAU4+4wH44OEoys85A1N3nnsCxAHD3Me6e7e7ZWVlZJ9s2EZFjOrNhOpOH9yItGmHw2Fl8krcj7JLKRSJDZQ7QxsxamlkqsYCYWrzR3Xe6e6a7t3D3FsAsoI+758Lh3kd/YtdijvSN6yzHO5aISBhaZtZiyvBepFePMmTsbOau2R52SQmXsFBx9wLgLuAtYCkwxd0Xm9n9ZtanDIe4EMhz95WlbBtA6ddZREQqlKb1azJleC8a1E7l+qdmM2vlV2GXlFAWXOeukrKzsz03V50ZEUm8zbsOcN242eRt38e4G86ld5vMsEs6aWY2191L/SygPlEvIlIOGtapzqRhObRoUItbnp3De59tDrukhFCoiIiUk8zaabxwew5nNarNsAm5vLX4y7BLijuFiohIOcqolcrE23Lo0Lgud0ycx2sLNxz/RUlEoSIiUs7q1qjGX27rSfdmGdwzaT4vz8sLu6S4UaiIiISgdlqU8becS06rBtz34kImz1kbdklxoVAREQlJzdQoT990Lhe2yeKnf/2UCTNXh13SKVOoiIiEqHq1FMbc0J1L2jXkF68uZty00j6alzwUKiIiIUuLpvDEkO5ccc5pPPDGUh5/b0XYJZ20aNgFiIgIpEYjjBzUlWopC3n4rWUcLCji3y9pg1lpc+nzsEkAAArRSURBVPNWXAoVEZEKIpoS4Y8DupCaEmHku8vJLyjip5e1TapgUaiIiFQgKRHjf7/fidRohCc/+IKDBYX88qr2SRMsChURkQomEjEeuKYjqdEIz8xYTX5BEb/p25FIpOIHi0JFRKQCMjN+eVV70qIpPPnBF+QXFPHg9zuRUsGDRaEiIlJBmRk/vaxt7CL+u8vJLyziD/07E02puDfuKlRERCowM+NH3zuLtGiEh99axqHCIh4d1JVqFTRYFCoiIkngzovOJC0a4YE3lpJfMI/Hh3QlLZoSdlnfUjGjTkREvuW2C1pxf98O/N/STQx7bi4HDhWGXdK3KFRERJLIDb1a8OC15/Dh8i3cMn4O+/ILwi7pGxQqIiJJZlCPZvyhf2dmrfyKG5/+mN0HDoVd0mEKFRGRJHRttyaMHNyVeWt3cP1TH7Nzf8UIFoWKiEiSuqpTY54Y0o3FG3YyZNwstu/ND7skhYqISDK7tMNpjL6+O59v2sPgsbPYuudgqPUoVEREkty/nN2Ip27MZvVXexk4eiabdh0IrRaFiohIJXBBmyzG39yDjTsPMHD0TDbs2B9KHQoVEZFKIqdVAybc2oOv9uQzYPRM1m3bV+41KFRERCqR7s3rM/H2nuw+UMDA0TNZtXVvub6/QkVEpJLp1KQez9/ekwMFRQwcPZMVm3eX23srVEREKqEOjesyaVgORQ4DR89i6cZd5fK+ChURkUrqrEbpTB6eQzTFGDx2FovW70z4eypUREQqsdZZtZkyvBe1UqMMHjuL+Wu3J/T9FCoiIpVc8wa1mDw8h4yaqQwdN5uPV21L2HspVEREqoAmGTWZMrwXjepU58anP+ajFVsT
8j4KFRGRKuK0utWZNDyHNo1q4wl6Dz35UUSkCmmYXp2/3XE+kYgl5PgJ7amY2WVmtszMVpjZz46xXz8zczPLDpaHmNmCEl9FZtbFzNKPWL/VzB4JXvMjM1tiZp+Y2btm1jyRbRMRSVaJChRIYE/FzFKAx4HvAXnAHDOb6u5LjtgvHRgBzC5e5+4TgYnB9nOAV919QbC5S4nXzgVeDhbnA9nuvs/Mfgg8BAxMRNtERKR0ieyp9ABWuPtKd88HJgF9S9nvN8QC4GjTag4GXjhypZm1ARoC0wDc/T13L57oZhbQ5NTKFxGRE5XIUDkDWFdiOS9Yd5iZdQWauvvrxzjOQEoJFWJhM9ndS7vedCvw99IOZmbDzCzXzHK3bNlyrPpFROQEJTJUShu0OxwAZhYB/gTcd9QDmPUE9rn7olI2D6L0HsxQIBt4uLRjuvsYd8929+ysrKxjt0BERE5IIkMlD2haYrkJsKHEcjrQEXjfzFYDOcDU4ov1gaMFR2cg6u5zj1h/CfBfQB93D/fxZyIiVVAibymeA7Qxs5bAemIBcV3xRnffCWQWL5vZ+8CP3T03WI4A/YELSzn2t66zBENpo4HL3H1zXFsiIiJlkrBQcfcCM7sLeAtIAZ5298Vmdj+Q6+5Tj3OIC4E8d19ZyrYBwBVHrHsYqA28aGYAa929zyk1QkREToiVfp27ajCzLcCak3x5JpCYeQ7Kn9pS8VSWdoDaUlGdSluau3upF6WrdKicCjPLdffs4+9Z8aktFU9laQeoLRVVotqiub9ERCRuFCoiIhI3CpWTNybsAuJIbal4Kks7QG2pqBLSFl1TERGRuFFPRURE4kahIiIicaNQKSMzW21mnwbPcSn+1H99M3vHzJYH3zPCrvN4jtKOX5nZ+hLPqTnyg6UVkpnVM7OXzOwzM1tqZr2S8ZzAUduSdOfFzNoe8cyjXWZ2b7Kdl2O0I+nOCYCZ/buZLTazRWb2gplVN7OWZjY7OCeTzSw1Lu+layplE8xPlu3uW0usewjY5u4PBg8hy3D3n4ZVY1kcpR2/Ava4++/DqutkmNmzwDR3Hxf8QtQE/pMkOydw1LbcSxKel2LBM5XWAz2BO0nC8wLfasfNJNk5MbMzgOlAe3ffb2ZTgDeJzUrysrtPMrMngYXu/udTfT/1VE5NX+DZ4OdngWtCrKVKMbM6xKbyeQrA3fPdfQdJeE6O0ZZkdzHwhbuvIQnPSwkl25GsokANM4sS+wfLRuBfgJeC7XE7JwqVsnPgbTOba2bDgnWN3H0jQPC9YWjVlV1p7QC4y2KPYn66og9NBFoBW4BnzGy+mY0zs1ok5zk5Wlsg+c5LSSVnGU/G81LsyNnSk+qcuPt64PfAWmJhshOYC+xw94Jgt2897+pkKVTK7nx37wZcDtxpZqXNnpwMSmvHn4HWxB7VvBH4Q4j1lVUU6Ab82d27AnuBn4Vb0kk7WluS8bwAEAzh9QFeDLuWU1FKO5LunATB1xdoCTQGahH7/T9SXK6FKFTKyN03BN83A68Qe1zyJjM7HSD4XuGn3C+tHe6+yd0L3b0IGEusbRVdHrFZrGcHyy8R+8OcdOeEo7QlSc9LscuBee6+KVhOxvMCR7QjSc/JJcAqd9/i7oeAl4HzgHrBcBh8+3lXJ02hUgZmVsvM0ot/Bv4VWARMBW4MdrsReDWcCsvmaO0o/mUP/BuxtlVo7v4lsM7M2garLgaWkGTnBI7elmQ8LyUc+cyjpDsvgW+0I0nPyVogx8xqmpnx9e/Ke0C/YJ+4nRPd/VUGZtaK2L/qITZU8by7/9bMGgBTgGbETlx/d98WUpnHdYx2TCDWnXdgNTC8ePy7IjOzLsA4IBVYSezOnAhJdE6KHaUtI0nO81ITWAe0Ch7GR7L9rsBR25Gsvyu/BgYCBcB84DZi11AmAfWDdUPj8cRchYqIiMSNhr9ERCRuFCoiIhI3ChUREYkbhYqIiMSNQkVEROJGoSIiInGjUBGpQMxsvJn1
O/6eIhWTQkVEROJGoSJyHGbWInhw1tjgQUdvm1kNM3vfzLKDfTKDZ9VgZjeZ2d/M7DUzW2Vmd5nZj4IZiGeZWf0yvm93M/sgmFH6rRJzZ91uZnPMbKGZ/TWYfqOuxR7AFgn2qWlm68ysmpm1NrN/BMeZZmZnB/v0Dx7atNDMPkzIfzypchQqImXTBnjc3TsAO4DvH2f/jsB1xCYc/C2wL5iBeCZww/HezMyqAaOAfu7eHXg6OA7EHqx0rrt3BpYCtwbTiCwEvhPsczXwVjCB4Bjg7uA4PwaeCPb5JXBpcJw+x6tJpCyix99FRIjN8rog+Hku0OI4+7/n7ruB3Wa2E3gtWP8p0KkM79eWWDC9E5sDkBRiU60DdDSzB4B6QG3grWD9ZGLzO71H7BkgT5hZbWIz0r4YHAcgLfg+AxgfPAnw5TLUJHJcChWRsik50V4hUIPY5HzFvf3qx9i/qMRyEWX7vTNgsbv3KmXbeOAad19oZjcB3w3WTwV+FwyvdQf+SezZGTvcvcuRB3H3H5hZT+BKYIGZdXH3r8pQm8hRafhL5OStJvbHG76eQjxelgFZZtYLYsNhZtYh2JYObAyGyIYUv8Dd9wAfA48CrwfP/dgFrDKz/sFxzMw6Bz+3dvfZ7v5LYCvQNM5tkCpIoSJy8n4P/NDMPgIy43lgd88nFlT/a2YLgQXEhrEAfgHMBt4BPjvipZOBocH3YkOAW4PjLCb2FECAh83sUzNbBHxI7JqMyCnR1PciIhI36qmIiEjc6EK9SAjM7HHg/CNWP+ruz4RRj0i8aPhLRETiRsNfIiISNwoVERGJG4WKiIjEjUJFRETi5v8DpUPaQ/xhUpwAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot CV误差曲线\n",
    "test_means = grid_search.cv_results_[ 'mean_test_score' ]\n",
    "test_stds = grid_search.cv_results_[ 'std_test_score' ]\n",
    "train_means = grid_search.cv_results_[ 'mean_train_score' ]\n",
    "train_stds = grid_search.cv_results_[ 'std_train_score' ]\n",
    "\n",
    "n_leafs = len(num_leaves_s)\n",
    "\n",
    "x_axis = num_leaves_s\n",
    "plt.plot(x_axis, -test_means)\n",
    "#plt.errorbar(x_axis, -test_means, yerr=test_stds,label = ' Test')\n",
    "#plt.errorbar(x_axis, -train_means, yerr=train_stds,label = ' Train')\n",
    "plt.xlabel( 'num_leaves' )\n",
    "plt.ylabel( 'Log Loss' )\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([-0.47815096, -0.47779211, -0.47768608, -0.47722504])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_means"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 性能抖动，取系统推荐值：70, 不必再细调"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. min_child_samples\n",
    "叶子节点的最小样本数目\n",
    "\n",
    "叶子节点数目：70，共9类，平均每类8个叶子节点\n",
    "每棵树的样本数目数目最少的类（稀有事件）的样本数目：200 * 2/3 * 0.7 = 100\n",
    "所以每个叶子节点约100/8 = 12个样本点\n",
    "\n",
    "搜索范围：10-50"
   ]
  },
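  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The back-of-the-envelope estimate above can be reproduced in a few lines of plain Python. This is a sketch under the figures quoted in the text (about 200 samples in the rarest class, 3-fold CV, subsample=0.7, 70 leaves); the variable names are chosen here for illustration.\n",
    "\n",
    "```python\n",
    "num_leaves = 70\n",
    "num_classes = 9\n",
    "rare_class_samples = 200  # figure quoted in the text above\n",
    "train_fraction = 2 / 3    # each 3-fold split trains on 2/3 of the data\n",
    "subsample = 0.7           # row subsampling ratio per tree\n",
    "\n",
    "# Average number of leaves available per class, rounded to a whole leaf\n",
    "leaves_per_class = round(num_leaves / num_classes)\n",
    "# Rare-class samples that a single tree actually sees\n",
    "rare_per_tree = rare_class_samples * train_fraction * subsample\n",
    "# Rough number of rare-class samples per leaf\n",
    "samples_per_leaf = rare_per_tree / leaves_per_class\n",
    "\n",
    "print(leaves_per_class, round(rare_per_tree), round(samples_per_leaf))\n",
    "```\n",
    "\n",
    "The result, about 12 samples per leaf, sits near the lower end of the search range, which is why the grid starts at 10."
   ]
  },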
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 4 candidates, totalling 12 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "[Parallel(n_jobs=4)]: Done   8 out of  12 | elapsed: 16.5min remaining:  8.2min\n",
      "[Parallel(n_jobs=4)]: Done  12 out of  12 | elapsed: 23.5min finished\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "LGBMClassifier(bagging_freq=1, boosting_type='gbdt', class_weight=None,\n",
       "               colsample_bytree=0.7, importance_type='split', learning_rate=0.1,\n",
       "               max_bin=127, max_depth=7, min_child_samples=40,\n",
       "               min_child_weight=0.001, min_split_gain=0.0, n_estimators=394,\n",
       "               n_jobs=4, num_class=9, num_leaves=70, objective='multiclass',\n",
       "               random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=False,\n",
       "               subsample=0.7, subsample_for_bin=200000, subsample_freq=0)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':70,\n",
    "          'max_bin': 127, #2^6,原始特征为整数，很少超过100\n",
    "          'subsample': 0.7,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.7,\n",
    "         }\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "\n",
    "min_child_samples_s = range(10,50,10) \n",
    "tuned_parameters = dict( min_child_samples = min_child_samples_s)\n",
    "\n",
    "grid_search = GridSearchCV(lg, n_jobs=4,  param_grid=tuned_parameters, \n",
    "                           cv = kfold, scoring=\"neg_log_loss\", verbose=5, refit = True,return_train_score='warn')\n",
    "grid_search.fit(X_train , y_train)\n",
    "\n",
    "grid_search.best_estimator_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.47476563714086817\n",
      "{'min_child_samples': 40}\n"
     ]
    }
   ],
   "source": [
    "# examine the best model\n",
    "print(-grid_search.best_score_)\n",
    "print(grid_search.best_params_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAD6CAYAAACoCZCsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3deXxU9b3/8dcnO4GQCIQ1wYDikrIEGIEqeK3+vBeXYhdQqLW1rtUi3Kv2Vu/aq723rV28orS9VMUNQaxVKaXaRamghRJklcUiRQmgxCpRRJbA5/fHnAlDnJAJJJnt/Xw88kjOd86cfL6PA/PO+X7nzNfcHRERyTxZiS5AREQSQwEgIpKhFAAiIhlKASAikqEUACIiGUoBICKSoeIKADMba2YbzWyTmd12lP3Gm5mbWSjYzjWzh81sjZmtN7Pbo/Z90Mx2mtna4++GiIi0VE5zO5hZNjAdOB+oAZaZ2Tx3X9dovyJgCrA0qnkCkO/ug8ysEFhnZrPdfQvwEHAf8Ei8xXbr1s0rKiri3V1ERIDly5e/6+6ljdubDQBgBLDJ3TcDmNkc4BJgXaP97gTuAm6NanOgo5nlAB2A/cAHAO7+kplVtKQTFRUVVFdXt+QpIiIZz8zejNUezxBQH2Br1HZN0BZ98KFAubvPb/TcXwAfATuAt4Afuvt78RYdHPs6M6s2s+ra2tqWPFVERI4ingCwGG0Nnx9hZlnA3cAtMfYbARwEegP9gFvMrH9LCnT3Ge4ecvdQaeknrmBEROQYxTMEVAOUR22XAdujtouAgcBCMwPoCcwzs3HAl4Dn3P0AsNPMXgZCwOZWqF1ERI5DPFcAy4ABZtbPzPKAicC8yIPuXufu3dy9wt0rgCXAOHevJjzsc66FdQRGARtavRciItJizQaAu9cDk4HngfXAXHd/zczuCP7KP5rpQCdgLeEgmenuqwHMbDbwJ+BUM6sxs6uPox8iItJClkofBx0KhVzvAhIRaRkzW+7uocbtuhNYRCRDZUQAPL70LV56XW8hFRGJlvYBsL/+EI8teZNrH6lWCIiIREn7AMjLyeKxa0bSv7STQkBEJEraBwBAl455zApC4BqFgIgIkCEBAOEQePyakZykEBARATIoAABOUAiIiDTIqACAT4bAHxUCIpKhMi4A4HAInBxMDCsERCQTZWQAQDgEZikERCSDZWwAgEJARDJbRgcAfDIEFm7cmeiSRETaRcYHABwOgQHdO3Hdo8sVAiKSERQAAYWAiGQaBUCUkkKFgIhkDgVAIwoBEckUCoAYFAIikgkUAE1QCIhIulMAHEUkBE7p0YnrHlnOiwoBEUkjCoBmlBTm8djVIzmlZyeuVwiISBpRAMRBISAi6UgBEKdPhMAGhYCIpDYFQAuUFOYx6+pR4RB4VCEgIqlNAdBCxYW5CgERSQtxBYCZjTWzjWa2ycxuO8p+483MzSwUbOea2cNmtsbM1pvZ7S09ZjJqHAIvbHgn0SWJiLRYswFgZtnAdOACoBKYZGaVMfYrAqYAS6OaJwD57j4IGA5cb2YV8R4zmUVC4NSeRXz90VcVAiKScuK5AhgBbHL3ze6+H5gDXBJjvzuBu4C9UW0OdDSzHKADsB/4oAXHTGrFhbk8dvVIhYCIpKR4AqAPsDVquyZoa2BmQ4Fyd5/f6Lm/AD4CdgBvAT909/fiOWbUsa8zs2ozq66tTb4FWxQCIpKq4gkAi9HmDQ+aZQF3A7fE2G8EcBDoDfQDbjGz/s0d84hG9xnuHnL3UGlpaRzltr9ICJzWKxwCf1ivEBCR5BdPANQA5VHbZcD2qO0iYCCw0My2AKOAecFE8JeA59z9gLvvBF4GQnEcM+UUF+by6FXhELjhMYWAiCS/eAJgGTDAzPqZWR4wEZgXedDd69y9m7tXuHsFsAQY5+7VhId9zrWwjoTDYUNzx0xVCgERSSXNBoC71wOTgeeB9cBc
d3/NzO4ws3HNPH060AlYS/hFf6a7r27qmMfRj6RRXJjLo5HhoMeWKwREJGmZe8yh96QUCoW8uro60WXEpe7jA1zxwFLW7/iAn315OOed3iPRJYlIhjKz5e4eatyuO4HbSHGH8JXA6b0660pARJKSAqANKQREJJkpANpYJAQqgxD4/TqFgIgkBwVAOyjukMsjQQjcMEshICLJQQHQThQCIpJsFADtqCEEehdzw6zl/E4hICIJpABoZ8UdcnnkqhFU9i7mRoWAiCSQAiABFAIikgwUAAkSfnfQ4RD47WtvJ7okEckwCoAE6lxwOAS+8firCgERaVcKgARTCIhIoigAkkAkBD6lEBCRdqQASBKdC3J5RCEgIu1IAZBEokPgxlkKARFpWwqAJBMJgYF9FAIi0rYUAEkoEgKDysIh8LxCQETagAIgSXUuyOXhq8Ih8A2FgIi0AQVAElMIiEhbUgAkuc4F4Y+NUAiISGtTAKSAokYh8NxahYCIHD8FQIqIDoHJjysEROT4KQBSSCQEBisERKQVKABSTFEwMXw4BHYkuiQRSVEKgBR0ZAisUAiIyDGJKwDMbKyZbTSzTWZ221H2G29mbmahYPtyM1sZ9XXIzKqCxy4zs9Vm9pqZ3dU63ckcCgEROV7NBoCZZQPTgQuASmCSmVXG2K8ImAIsjbS5+yx3r3L3KuAKYIu7rzSzrsAPgPPc/VNADzM7r1V6lEEiITCkvITJj6/gN2sUAiISv3iuAEYAm9x9s7vvB+YAl8TY707gLmBvE8eZBMwOfu4PvO7utcH274Evxl21NCgqyOWhr53BkPISbpqtEBCR+MUTAH2ArVHbNUFbAzMbCpS7+/yjHOcyDgfAJuA0M6swsxzgc0B5rCeZ2XVmVm1m1bW1tbF2yXgKARE5FvEEgMVo84YHzbKAu4FbmjyA2Uhgj7uvBXD394EbgCeARcAWoD7Wc919hruH3D1UWloaR7mZKXo4SCEgIvGIJwBqOPKv8zJge9R2ETAQWGhmW4BRwLzIRHBgIof/+gfA3X/l7iPd/dPARuAvLS9fonXKzzk8J6AQEJFmxBMAy4ABZtbPzPIIv5jPizzo7nXu3s3dK9y9AlgCjHP3ami4QphAeO6ggZl1D76fANwI3N8K/cl4kRCoUgiISDOaDQB3rwcmA88D64G57v6amd1hZuPi+B1nAzXuvrlR+z1mtg54Gfieu7/ewtqlCZEQGKoQEJGjMHdvfq8kEQqFvLq6OtFlpIzd++q58sE/s2LrLu6dNJQLB/VKdEkikgBmttzdQ43bdSdwGuuUn8NDwZXATbNXsEBXAiISRQGQ5iIhMKyvQkBEjqQAyACd8nOY+TWFgIgcSQGQIRqHwK9XKwREMp0CIINEh8CUOQoBkUynAMgwCgERiVAAZKBO+Tk8FBUC81dvb/5JIpJ2FAAZqmNUCEyds1IhIJKBFAAZLBICw/ueoBAQyUAKgAzXMT+HmV87oyEEfrVKISCSKRQAckQI/OMTCgGRTKEAECAqBE5UCIhkCgWANOiYn8PMKxUCIplCASBHUAiIZA4FgHxCdAhMnbNCISCSphQAElMkBEIVXRQCImlKASBNCt8ncDgE5ikERNKKAkCOqjDvcAj8o0JAJK0oAKRZkRA4QyEgklYUABKXwrzwfQIKAZH0oQCQuDUOgWdXbkt0SSJyHBQA0iKREBjRrwv/9MRKhYBIClMASIsV5uXw4JUKAZFUpwCQY6IQEEl9cQWAmY01s41mtsnMbjvKfuPNzM0sFGxfbmYro74OmVlV8NgkM1tjZqvN7Dkz69Y6XZL2EgmBkf26KgREUlCzAWBm2cB04AKgEphkZpUx9isCpgBLI23uPsvdq9y9CrgC2OLuK80sB7gH+Iy7DwZWA5Nbo0PSvgrzcnjgypBCQCQFxXMFMALY5O6b3X0/MAe4JMZ+dwJ3AXubOM4kYHbwswVfHc3MgM6A3leYohqHwDMrFAIiqSCeAOgDbI3argna
GpjZUKDc3ecf5TiXEQSAux8AbgDWEH7hrwQeiPUkM7vOzKrNrLq2tjaOciURokPg5rkKAZFUEE8AWIw2b3jQLAu4G7ilyQOYjQT2uPvaYDuXcAAMBXoTHgK6PdZz3X2Gu4fcPVRaWhpHuZIo0XMCCgGR5BdPANQA5VHbZRw5XFMEDAQWmtkWYBQwLzIRHJjI4eEfgCoAd3/D3R2YC5zZ4uol6XTIy1YIiKSIeAJgGTDAzPqZWR7hF/N5kQfdvc7du7l7hbtXAEuAce5eDQ1XCBMIzx1EbAMqzSzyJ/35wPrj7o0khcYh8PSKmkSXJCIxNBsA7l5P+B06zxN+kZ7r7q+Z2R1mNi6O33E2UOPum6OOuR34L+AlM1tN+Irgf46lA5KcIiEwqn9Xbpm7SiEgkoQsPAKTGkKhkFdXVye6DGmBj/cf5OqHl7Fk89/40aVD+PzQskSXJJJxzGy5u4cat+tOYGlTHfKyeeCruhIQSUYKAGlzkRD49ElduVkhIJI0FADSLjrkZXP/V87gTIWASNJQAEi7UQiIJBcFgLSrxiHwg+c3UH/wUKLLEslICgBpd5E5gUuHlzP9xTeY9PMlbN/1caLLEsk4CgBJiILcbL4/fjD3TKxi3fYPuHDaIn6/7p1ElyWSURQAklCXVPVh/pQx9CnpwDWPVHPn/HXsr9eQkEh7UABIwvXr1pFf3ngmV55ZwQOL/8r4n73CW3/bk+iyRNKeAkCSQn5ONt8e9yl+9uVhbHn3Iy6atoj5q7VEhEhbUgBIUhk7sBe/njKGk7p3YvLjK/jXp9ew98DBRJclkpYUAJJ0yrsU8uTXP831f9efWUvf4nPTX2bTzt2JLksk7SgAJCnlZmdx+wWnM/NrZ7Dzw3189t7FPLVcN46JtCYFgCS1z5zanQVTxjC4rJhbnlzFzXNX8tG++kSXJZIWFACS9HoWF/D4taOYet4Anl6xjc/et5j1Oz5IdFkiKU8BICkhO8v4p/NPYdY1I9m9t55Lpr/MY0veJJXWsxBJNgoASSlnntSNBVPHMKp/V/7tmbVMfnwFH+w9kOiyRFKSAkBSTrdO+Tx05Rl8a+xpPPfa21w8bTGrtu5KdFkiKUcBICkpK8u44ZyTmHv9KA4ecsb/7BXuX7RZQ0IiLaAAkJQ2/MQu/HrKaM45tTvf+fV6rn2kmvc/2p/oskRSggJAUl5JYR4zrhjOtz9byUuvv8uF0xaxbMt7iS5LJOkpACQtmBlXntWPp244k7ycLCbOWML0Fzdx6JCGhESaogCQtDKorJj5N43mwkG9+MHzG/nqzD9T++G+RJclkpQUAJJ2igpymTaxiu99YRB//ut7XHDPIhb/5d1ElyWSdOIKADMba2YbzWyTmd12lP3Gm5mbWSjYvtzMVkZ9HTKzKjMratT+rpn9b2t1SsTMmDiiL/Mmj6akMJcrHlzKj367UesPi0RpNgDMLBuYDlwAVAKTzKwyxn5FwBRgaaTN3We5e5W7VwFXAFvcfaW7fxhpDx57E/hl63RJ5LBTexYxb/JZTBhexr0vbOJLP1/KjjqtPywC8V0BjAA2uftmd98PzAEuibHfncBdwN4mjjMJmN240cwGAN2BRXFVLNJChXk53DV+CHdfNoS12+u48J5FvLBB6w+LxBMAfYCtUds1QVsDMxsKlLv7/KMc5zJiBADhYHjCm7iDx8yuM7NqM6uura2No1yR2D4/tIz5N42mV3EHrnqomu9o/WHJcPEEgMVoa3ixNrMs4G7gliYPYDYS2OPua2M8PJHYwRD+Re4z3D3k7qHS0tI4yhVpWv/STvzyxjP5yqdP5P7Ff2XCz15h63taf1gyUzwBUAOUR22XAdGLtRYBA4GFZrYFGAXMi0wEB2K+yJvZECDH3Ze3sG6RY1aQm80dlwzkZ18exuZ3P+LCaYtYsGZHossSaXfxBMAyYICZ9TOzPMIv5vMiD7p7nbt3c/cKd68AlgDj3L0aGq4QJhCeO2gs5ryASHsY
O7AXC6aM4aTSTtw461X+7RmtPyyZpdkAcPd6YDLwPLAemOvur5nZHWY2Lo7fcTZQ4+6bYzx2KQoASaCG9YfP7s9jS97i8z95hTdqtf6wZAZLpU9PDIVCXl1dnegyJE29uGEnN89dyb76Q3zncwP5wrCyRJck0irMbLm7hxq3605gkcBnTuvOb6aezcA+xdw8dxW3PrmKPfu1/rCkLwWASJSexQU8fs1Ippw3gKdereGz9y5mw9taf1jSkwJApJGc7CxuPv8UZl09kg/21nPJfS/z+NK3tNiMpB0FgEgTzjy5GwumjGFEvy78y9NruGn2Cj7U+sOSRhQAIkdRWpTPw18bwT+PPZXfrH2bi6YtZnWN1h+W9KAAEGlGVpZx4zkn88R1o6g/eIgv/vQVHlz8Vw0JScpTAIjEKVTRhQVTx/B3p3TnjvnruPaR5ezao/WHJXUpAERaoKQwj59/ZTj/cXElf3x9Jxfes4hqrT8sKUoBINJCZsZVo8PrD+dkZ3GZ1h+WFKUAEDlGg8tKmD9lNBcM7Kn1hyUlKQBEjkPnglzunTSU7wbrD184bRGvbNL6w5IaFAAix8nMmDSiL89OPovOBTlc/sBSfqz1hyUFKABEWslpPTvzq5tGM35YGdNe2MSX7l/K23VNrZAqkngKAJFWVJiXww8mDOHHlw5h7bY6LrjnJa0/LElLASDSBr4wrIxf3TSansH6w/+zYL3WH5akowAQaSMnlXbi6RvP5IpRJzLjpc1M+L8/af1hSSoKAJE2VJCbzZ2fG8hPLh/G5p27uXDaIn6j9YclSSgARNrBhYN6sWDqGPqXduKGWa/y78+s1frDknAKAJF2Ut6lkCev/zTXjunHo0ve5PM/eYXNWn9YEkgBINKO8nKy+NeLKnnwyhBv133Mxfcu5ukVNYkuSzKUAkAkAc49rQcLpo5hYO9i/umJVXxT6w9LAigARBKkV3EHHr92JFPOPZlfvFrDuPteZuPbHya6LMkgCgCRBMrJzuLmvz+Vx64eya49Bxh332Jm/1nrD0v7UACIJIGzTu7Gb6aG1x++/ZdrmDJnpdYfljYXVwCY2Vgz22hmm8zstqPsN97M3MxCwfblZrYy6uuQmVUFj+WZ2Qwze93MNpjZF1unSyKpKbL+8Df/4VQWrNnBxfcuZk1NXaLLkjTWbACYWTYwHbgAqAQmmVlljP2KgCnA0kibu89y9yp3rwKuALa4+8rg4X8Fdrr7KcFx/3i8nRFJdVlZxjc+E15/eH/9Ib7w05eZ+bLWH5a2Ec8VwAhgk7tvdvf9wBzgkhj73QncBTT18YeTgNlR21cB3wVw90Purg9RFwmEKrqwYMoY/u6UUv7rV+u47lGtPyytL54A6ANsjdquCdoamNlQoNzd5x/lOJcRBICZlQRtd5rZq2b2pJn1iPUkM7vOzKrNrLq2tjaOckXSwwkd8/j5V0L8+8WVLNy4k4umLWb5m1p/WFpPPAFgMdoarkfNLAu4G7ilyQOYjQT2uPvaoCkHKANedvdhwJ+AH8Z6rrvPcPeQu4dKS0vjKFckfZgZVwfrD2dnGZf+3xJ+uvANrT8srSKeAKgByqO2y4DtUdtFwEBgoZltAUYB8yITwYGJHDn88zdgD/B0sP0kMKxFlYtkkMj6w2MH9uT7z23gyoeW8e5urT8sxyeeAFgGDDCzfmaWR/jFfF7kQXevc/du7l7h7hXAEmCcu1dDwxXCBMJzB5HnOPAr4Jyg6Txg3fF3RyR9dS7I5b5JQ/nvzw9k6ea/ccE9i3jlDU2dybFrNgDcvR6YDDwPrAfmuvtrZnaHmY2L43ecDdS4++ZG7d8Cvm1mqwm/Q6jJISQRCTMzLh95Is98I1h/+P6l/Ph3r3NQQ0JyDCyV3l4WCoW8uro60WWIJIWP9tXzH8++xlOv1jCyXxfumTiUnsUFiS5LkpCZLXf3UON23QkskqI65ufwo0uH8KMJQ1izrY4Lpy3ixY07E12WpBAFgEiK++Lw
MuZNHk33ony+NnMZ312wngMHtf6wNE8BIJIGTu7eiWe+cRZfHtWX/3tpM5dq/WGJgwJAJE0U5Gbznc8NYvqXhrHpnd1cNG0Rz63V+sPSNAWASJq5aHAvfj1lDP26deTrj73Kfz6r9YclNgWASBrq27WQJ79+JteM7sfDf3qTL/70Ff767keJLkuSjAJAJE3l5WTxbxdX8sBXQ2zb9TEXT1vEsyu3JbosSSIKAJE0d97pPVgwZQyVvTszdc5KvvWL1Xy8X0NCogAQyQi9Szow+9pRTP7MycxdvpVx9y3m9Xe0/nCmUwCIZIic7Cxu/YdTefSqkbwfrD/8xDKtP5zJFAAiGWb0gG4smDqa4SeewLeeWsNUrT+csRQAIhmoe1EBj1w1klv//hTmr97OZ+9dzNptWn840ygARDJUdpYx+dwBPHH9p9lXf4gv/OQVHtL6wxlFnwYqIrz/0X5ufXIVf9iwk97FBVT1LWFwWQlDykoYVFZMp/ycRJcox6GpTwPVWRURTuiYx/1fDfGL5TX88fVaVtfUsWDN2wCYwcmlncKBUF7MkLISTutVRH5OdoKrluOlKwARiem9j/azumYXq7bWhb/X7OLd3fsByMvO4vReRQwuK2FwWTFV5SX0L+1EdlasJcQl0Zq6AlAAiEhc3J3tdXtZvXUXK2t2sXprHWu21bF7Xz0AHfOyGVQWvkKIXC30KemAmUIh0TQEJCLHxczoU9KBPiUduGBQLwAOHXI2v7ubVVvrWFWzi1U1dcx8eQv7g/UIunbMY0h5+CphSHC10LVTfiK7IVEUACJyzLKyjJO7F3Fy9yK+OLwMgP31h9jw9gesqqlj1dZdrK7ZxYsbdxIZbCg7oQNDyksYUlbM4LISBvUppqMmmRNCQ0Ai0uZ276tn7ba6hjmFVTW7qHn/YwCyLLygzZCyEgaXl1BVVsKpPYvIy9G71FuLhoBEJGE65ecwqn9XRvXv2tD2t937WF0TDB1t3cULG3by5PIaIJhk7t2ZquAqYUh5Cf27dSRLk8ytSlcAIpIU3J2a9z8+IhTWbqvjo+CTS4vycxjYp7hh+GhIeQm9igs0yRwHXQGISFIzM8q7FFLepZCLBocnmQ8ect6o3c2qreG3oa6uqeOBxZs5cDD8h2u3TvkNYRCZaD6hY14iu5FSFAAikrSys4xTehRxSo8iJoTKAdhXf5D1Oz5kdc0uVm4Nh8ILUZPMfbsUHjHJPLBPZwrz9FIXS1xDQGY2FrgHyAbud/fvNbHfeOBJ4Ax3rzazy4FvRu0yGBjm7ivNbCHQC/g4eOzv3X3n0erQEJCIxPLh3gOs2VYXHj4KQmHbrsOTzKf0KAommcNXCaf2LCI3O3MmmY/5RjAzywZeB84HaoBlwCR3X9dovyLg10AeMNndqxs9Pgh41t37B9sLgVsb73c0CgARiVfth/uCO5gPvx31/T3hj73Oz8misndnhgQ3rA0uK6Ff1/SdZD6eOYARwCZ33xwcaA5wCbCu0X53AncBtzZxnEnA7LgrFhE5DqVF+Zx3eg/OO70HEJ5k3vrex8FcQvjtqE8s28pDr2wBoKggJ+qGtXAw9Oyc3pPM8QRAH2Br1HYNMDJ6BzMbCpS7+3wzayoALiMcHNFmmtlB4CngOx7jcsTMrgOuA+jbt28c5YqIfJKZ0bdrIX27FvLZIb0BqD94iE21u1ndcCfzLma8tJn6Q+GXou5F+QwuK6EquEoYXFZMSWH6TDLHEwCx4q/hhdrMsoC7gSubPIDZSGCPu6+Nar7c3bcFQ0dPAVcAj3ziF7nPAGZAeAgojnpFROKSk53FaT07c1rPzlx6RniSee+Bg6zb8QGrtwbDRzW7+P36dxqeU9G1sOHehCFlxXyqdzEd8lLzk1HjCYAaoDxquwzYHrVdBAwEFgaXSj2BeWY2Lmp8fyKNhn/cfVvw/UMze5zwUNMnAkBEpD0V5GYzrO8JDOt7QkNb3ccHWLvt8P0Jy7a8x7xV4ZfByDuV
ot+OekqP1JhkjicAlgEDzKwfsI3wi/mXIg+6ex3QLbLdeHI3uEKYAJwdtU8OUOLu75pZLnAx8Pvj7o2ISBso7pDLWSd346yTG17q2PnBXlbV1DW8HfU3a99mzrLwaHl+ThYD+xQ3fFT24LISKroWJt18QrMB4O71ZjYZeJ7w20AfdPfXzOwOoNrd5zVziLOBmsgkciAfeD548c8m/OL/82PqgYhIAnTvXMD5lQWcX3l4kvmt9/Y03JuwausuZv/5LWa+vAUIh8jgsuKGieYh5SX06FyQwB7ooyBERNpM/cFDvP7O7iPejrrxnQ85GEwy9+xcEA6E8sPLbxZ3yG31OvRRECIi7SwnO3y/QWXvzkwcEW77eP9B1u2oi1pprY7frjs8ydy/W8fgSiF8lfCp3p0pyG2bSWYFgIhIO+qQl83wE7sw/MQuDW11ew6wetvhoaM/bf4bz6wMTzLnZBmn9ixi1jUjW/0tqAoAEZEEKy7MZcyAUsYMKG1oe7tub8NNa5t27m6ToSEFgIhIEupZXEDP4p78w6d6ttnvSP43qoqISJtQAIiIZCgFgIhIhlIAiIhkKAWAiEiGUgCIiGQoBYCISIZSAIiIZKiU+jA4M6sF3jzGp3cD3m3FchIpXfqSLv0A9SVZpUtfjrcfJ7p7aePGlAqA42Fm1bE+DS8VpUtf0qUfoL4kq3TpS1v1Q0NAIiIZSgEgIpKhMikAZiS6gFaULn1Jl36A+pKs0qUvbdKPjJkDEBGRI2XSFYCIiERRAIiIZKi0DAAze9DMdprZ2qi2Lmb2OzP7S/D9hETWGI8m+vFtM9tmZiuDrwsTWWO8zKzczF40s/Vm9pqZTQ3aU+q8HKUfKXdezKzAzP5sZquCvvxX0N7PzJYG5+QJM2vddQjbwFH68pCZ/TXqvFQlutZ4mFm2ma0ws/nBdpuck7QMAOAhYGyjttuAP7j7AOAPwXaye4hP9gPgbnevCr4WtHNNx6oeuMXdTwdGAd8ws0pS77w01Q9IvfOyDzjX3YcAVcBYMxsFfJ9wXwYA7wNXJ7DGeDXVF4BvRp2XlYkrsUWmAuujttvknKRlALj7S8B7jZovAR4Ofn4Y+Fy7FnUMmuhHSnL3He7+avDzh4T/cfchxc7LUVOs6GAAAAJcSURBVPqRcjxsd7CZG3w5cC7wi6A96c8JHLUvKcfMyoCLgPuDbaONzklaBkATerj7Dgj/Jwa6J7ie4zHZzFYHQ0RJPWQSi5lVAEOBpaTweWnUD0jB8xIMNawEdgK/A94Adrl7fbBLDSkScI374u6R8/LfwXm528zyE1hivP4X+GfgULDdlTY6J5kUAOnip8BJhC9zdwA/Smw5LWNmnYCngH909w8SXc+xitGPlDwv7n7Q3auAMmAEcHqs3dq3qmPTuC9mNhC4HTgNOAPoAnwrgSU2y8wuBna6+/Lo5hi7tso5yaQAeMfMegEE33cmuJ5j4u7vBP/QDwE/J/yfNiWYWS7hF81Z7v7LoDnlzkusfqTyeQFw913AQsLzGiVmlhM8VAZsT1RdxyKqL2ODITt3933ATJL/vJwFjDOzLcAcwkM//0sbnZNMCoB5wFeDn78KPJvAWo5Z5MUy8HlgbVP7JpNgHPMBYL27/zjqoZQ6L031IxXPi5mVmllJ8HMH4P8RntN4ERgf7Jb05wSa7MuGqD8ujPC4eVKfF3e/3d3L3L0CmAi84O6X00bnJC3vBDaz2cA5hD9C9R3gP4FngLlAX+AtYIK7J/UEaxP9OIfwMIMDW4DrI2PoyczMRgOLgDUcHtv8F8Lj5ylzXo7Sj0mk2Hkxs8GEJxSzCf8xONfd7zCz/oT/+uwCrAC+HPwFnbSO0pcXgFLCwygrga9HTRYnNTM7B7jV3S9uq3OSlgEgIiLNy6QhIBERiaIAEBHJUAoAEZEMpQAQEclQCgARkQylABARyVAKABGRDPX/AW5wbTNnuIaIAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the CV error curve\n",
    "test_means = grid_search.cv_results_[ 'mean_test_score' ]\n",
    "test_stds = grid_search.cv_results_[ 'std_test_score' ]\n",
    "train_means = grid_search.cv_results_[ 'mean_train_score' ]\n",
    "train_stds = grid_search.cv_results_[ 'std_train_score' ]\n",
    "\n",
    "x_axis = min_child_samples_s\n",
    "\n",
    "plt.plot(x_axis, -test_means)\n",
    "#plt.errorbar(x_axis, -test_means, yerr=test_stds, label='Test')\n",
    "#plt.errorbar(x_axis, -train_means, yerr=train_stds, label='Train')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([-0.48075808, -0.47768608, -0.4754073 , -0.47476564])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_means"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### min_child_samples=30"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Row sampling parameter: subsample/bagging_fraction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 5 candidates, totalling 15 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "[Parallel(n_jobs=4)]: Done  12 out of  15 | elapsed: 22.2min remaining:  5.6min\n",
      "[Parallel(n_jobs=4)]: Done  15 out of  15 | elapsed: 27.1min finished\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "GridSearchCV(cv=StratifiedKFold(n_splits=3, random_state=3, shuffle=True),\n",
       "             error_score='raise-deprecating',\n",
       "             estimator=LGBMClassifier(bagging_freq=1, boosting_type='gbdt',\n",
       "                                      class_weight=None, colsample_bytree=0.7,\n",
       "                                      importance_type='split',\n",
       "                                      learning_rate=0.1, max_bin=127,\n",
       "                                      max_depth=7, min_child_samples=30,\n",
       "                                      min_child_weight=0.001,\n",
       "                                      min_split_gain=0.0, n_estimators=394,\n",
       "                                      n_jobs=4, num_class=9, num_leaves=70,\n",
       "                                      objective='multiclass', random_state=None,\n",
       "                                      reg_alpha=0.0, reg_lambda=0.0,\n",
       "                                      silent=False, subsample=1.0,\n",
       "                                      subsample_for_bin=200000,\n",
       "                                      subsample_freq=0),\n",
       "             iid='warn', n_jobs=4,\n",
       "             param_grid={'subsample': [0.5, 0.6, 0.7, 0.8, 0.9]},\n",
       "             pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',\n",
       "             scoring='neg_log_loss', verbose=5)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':70,\n",
    "          'min_child_samples':30,\n",
    "          'max_bin': 127, # 2^7 - 1; the raw features are integers that rarely exceed 100\n",
    "          #'subsample': 0.7,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.7,\n",
    "         }\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "\n",
    "\n",
    "subsample_s = [i/10.0 for i in range(5,10)]\n",
    "tuned_parameters = dict( subsample = subsample_s)\n",
    "\n",
    "grid_search = GridSearchCV(lg, n_jobs=4,  param_grid=tuned_parameters, cv = kfold, scoring=\"neg_log_loss\", verbose=5, refit = False,return_train_score=True)\n",
    "grid_search.fit(X_train , y_train)\n",
    "#grid_search.best_estimator_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4754072980958521\n",
      "{'subsample': 0.7}\n"
     ]
    }
   ],
   "source": [
    "# examine the best model\n",
    "print(-grid_search.best_score_)\n",
    "print(grid_search.best_params_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAD4CAYAAAAHHSreAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3dd3xV9f3H8dcnkxVmwpAVdggRg4T5s46iFaoyFAVEC4hibSlttUOrdYCrtpXa1sFSURFE60BqpS6sg5UoIGGGABJm2Akh+/P7456EC1k3kOTcm3yej8d9cM/+nGu873vO95zvEVXFGGOM8RbkdgHGGGP8j4WDMcaYEiwcjDHGlGDhYIwxpgQLB2OMMSWEuF1AVYiMjNTo6Gi3yzDGmICSlJR0SFWjSptWK8IhOjqaxMREt8swxpiAIiK7yppmp5WMMcaUYOFgjDGmBAsHY4wxJVg4GGOMKcHCwRhjTAkWDsYYY0qwcDDGGFNCnQ6HLfszePI/m7Fuy40x5kx1Ohy+SjnEC59v54Pv9rtdijHG+BWfwkFEhorIFhFJEZF7y5lvtIioiCQ4w+NFZK3Xq1BE4p1pY0RkvYgki8hTXusIF5E3nG2tEpHo89vFsv1kUEd6XdCY6UuTycjOq67NGGNMwKkwHEQkGHgWGAbEAuNEJLaU+SKAacCqonGqukBV41U1HrgV2Kmqa0WkBfBnYIiq9gJaicgQZ7HJwFFV7QrMBP50XntYjpDgIB4bdSEHM3KY+dG26tqMMcYEHF+OHPoDKaqaqqq5wCJgRCnzzQCeArLLWM84YKHzvjOwVVXTneGPgRuc9yOA+c77t4AhIiI+1HlO4ts35eb+HXj56x1s2HO8ujZjjDEBxZdwaAvs9hpOc8YVE5E+QHtVXVrOesZwOhxSgBgRiRaREGAk0P7s7alqPnAcaHH2ykRkiogkikhienr62ZMr5XdXx9C8YRgPvLuBwkJrnDbGGF/CobRf7cXfoCIShOf0zz1lrkBkAJClqhsAVPUocBfwBvAFsBPI92V7xSNUZ6tqgqomREWV2uOsz5o0COX+a3qydvcxFq75/rzWZYwxtYEv4ZDG6V/1AO2AvV7DEUAcsFxEdgIDgSVFjdKOsZw+agBAVd9X1QGqOgjYAhSd9C/ennNU0QQ44usOnauR8W0Z1LkFf/rPZg5l5lT35owxxq/5Eg5rgG4i0klEwvB80S8pmqiqx1U1UlWjVTUaWAkMV9VEKD6yuBFPW0UxEWnp/NsM+Bkw15m0BJjgvB8NfKo1cCOCiDBjZByn8gp4/INN1b05Y4zxaxWGg3PefyqwDNgELFbVZBGZLiLDfdjGpUCaqqaeNf4ZEdkIfAU8qapbnfHzgBYikgLcDZR56WxV69qyEXde2oW3v9nDiu2Ha2qzxhjjd6Q23B2ckJCgVfUkuOy8Aq6a+TnhIcF8MO0HhIXU6fsEjTG1mIgkqWpCadPsm+8s9UKDmT48jpSDmcz54uyDHWOMqRssHEpxRUxLhvZqzd8/2cbuI1lul2OMMTXOwqEMDw2PJSRIeGhJsnXMZ4ypcywcytCmSX1+fVV3Pt18kGXJB9wuxxhjapSFQzkmDo4mpnUEj7yfzMmc/IoXMMaYWsLCoRyejvni2Hc8m799vLXiBYwxppawcKhA347NGde/PS9+tZNN+064XY4xxtQICwcf/H5oDE3qh1rHfMaYOsPCwQdNG4Rx37AYknYd5c2k3RUvYIwxAc7CwUej+7ajf3RznvjPZo6czHW7HGOMqVYWDj4SER4dFUdmdj5PWMd8xphazsKhErq3iuD2H3TmzaQ01uys9l7EjTHGNRYOlTRtSFfaNq3P/e98R15BodvlGGNMtbBwqKQGYSE8MrwXWw9kMu/LHW6XY4wx1cLC4RxcGduKq2Jb8czH20g7ah3zGWNqHwuHc/Tw8F4APPL+RpcrMcaYqmfhcI7a
Nq3PL6/sxkcbD/DRRuuYzxhTu1g4nIfJl3Sie6tGPLwkmaxc65jPGFN7+BQOIjJURLaISIqIlPlMZxEZLSIqIgnO8HgRWev1KhSReGfaOBH5TkTWi8iHIhLpjH9YRPZ4LfPjqtjR6hAaHMRjoy5kz7FT/P2TFLfLMcaYKlNhOIhIMPAsMAyIBcaJSGwp80UA04BVReNUdYGqxqtqPHArsFNV14pICPAMcIWq9gbWA1O9VjezaDlV/eA89q/a9Ytuzo192zH3i1S2HshwuxxjjKkSvhw59AdSVDVVVXOBRcCIUuabATwFZJexnnHAQue9OK+GIiJAY2BvZQr3J/f9uCeN6oXwwDsb7KlxxphawZdwaAt49zaX5owrJiJ9gPaqurSc9YzBCQdVzQPuAr7DEwqxwDyveac6p5teFJFmpa1MRKaISKKIJKanp/uwG9WneUNPx3yrdx7hraQ0V2sxxpiq4Es4SCnjin8ei0gQMBO4p8wViAwAslR1gzMciicc+gAX4DmtdJ8z+/NAFyAe2Af8tbR1qupsVU1Q1YSoqCgfdqN63di3PX07NuOJ/2zmqHXMZ4wJcL6EQxrQ3mu4HWeeAooA4oDlIrITGAgsKWqUdozl9Ckl8Hzxo6rb1XMeZjEw2Bl3QFULVLUQmIPntJbfCwoSHh0Zx/FTeTy1bLPb5RhjzHnxJRzWAN1EpJOIhOH5ol9SNFFVj6tqpKpGq2o0sBIYrqqJUHxkcSOetooie4BYESn6yX8VsMmZv43XfKOADee0Zy7o2aYxky/pxMLVu0naZR3zGWMCV4XhoKr5eK4kWobnC3yxqiaLyHQRGe7DNi4F0lQ11Wude4FHgP+JyHo8RxKPO5OfKrrEFbgC+HWl9shlvxzSjQua1OP+dzaQbx3zGWMClNSGq2sSEhI0MTHR7TKKfbhhPz99LYkHrunJ7T/o7HY5xhhTKhFJUtWE0qbZHdLV4OperfhhTEue/mgre4+dcrscY4ypNAuHaiAiPDK8F4WqTLeO+YwxAcjCoZq0b96AX/ywGx8m7+ezzQfdLscYYyrFwqEa3fGDznRt2YgHl2zgVG6B2+UYY4zPLByqUVhIEDNGxLH7yCn++dk2t8sxxhifWThUs0FdWnD9xW2Z/b9UUg5ax3zGmMBg4VAD/vDjnjQIC+GBd61jPmNMYLBwqAGRjcL53dAerEw9wrtr97hdjjHGVMjCoYaM69eB+PZNeXTpJo5n5bldjjHGlMvCoYYEBQmPjYrjaFaudcxnjPF7Fg41qNcFTZg4uBOvr/6etbuPuV2OMcaUycKhht39o+60jAjn/ne+s475jDF+y8KhhjUKD+Gh63qRvPcEr6zY5XY5xhhTKgsHFwyLa81l3aN4+qOtHDhR1iO3jTHGPRYOLhARpo/oRV5BIdOXWsd8xhj/Y+Hgko4tGvLzK7ry7/X7+HxrutvlGGPMGSwcXHTnZZ3pHNmQB9/bQHaedcxnjPEfFg4uCg8J5tGRcew6nMVzy7e7XY4xxhTzKRxEZKiIbBGRFBG5t5z5RouIikiCMzxeRNZ6vQpFJN6ZNq7oWdEi8qGIRDrjm4vIRyKyzfm3WVXsqL8a3DWSEfEX8MLy7aSmZ7pdjjHGAD6Eg4gEA88Cw4BYYJyIxJYyXwQwDVhVNE5VF6hqvKrGA7cCO1V1rYiEAM8AV6hqb2A9MNVZ7F7gE1XtBnziDNdq91/Tk/DQIP74nnXMZ4zxD74cOfQHUlQ1VVVzgUXAiFLmmwE8BZR1beY4YKHzXpxXQxERoDGw15k2ApjvvJ8PjPShxoDWMqIev7u6B1+lHGbJur0VL2CMMdXMl3BoC+z2Gk5zxhUTkT5Ae1VdWs56xuCEg6rmAXcB3+EJhVhgnjNfK1Xd58y3D2hZ2spEZIqIJIpIYnp64F/tc/OAjvRu14RH/72JE9nWMZ8xxl2+hIOUMq743IeIBAEzgXvKXIHIACBLVTc4w6F4wqEPcAGe00r3+V42qOps
VU1Q1YSoqKjKLOqXgoOEx0ZeyOHMHP66bIvb5Rhj6jhfwiENaO813I7Tp4AAIoA4YLmI7AQGAkuKGqUdYzl9SgkgHkBVt6vnJPtiYLAz7YCItAFw/j3o894EuAvbNeEng6J5ZeUu1qdZx3zGGPf4Eg5rgG4i0klEwvB80S8pmqiqx1U1UlWjVTUaWAkMV9VEKD6yuBFPW0WRPUCsiBT95L8K2OS8XwJMcN5PAN47pz0LUHf/qDuRjcK5/50NFBRa47Qxxh0VhoOq5uO5kmgZni/wxaqaLCLTRWS4D9u4FEhT1VSvde4FHgH+JyLr8RxJPO5MfhK4SkS24QmNJyuzQ4Gucb1Q/nhtLN/tOc6CVdYxnzHGHVIbLp1MSEjQxMREt8uoMqrKrfNWs273MT75zWW0jKjndknGmFpIRJJUNaG0aXaHtB8SEWaMjCOnoJBHl26qeAFjjKliFg5+qlNkQ+66rAtL1u3ly22H3C7HGFPHWDj4sbsu70J0iwb80TrmM8bUMAsHP1YvNJjpI+LYcegksz5PrXgBY4ypIhYOfu7S7lFc27sNzy5PYeehk26XY4ypIywcAsAfr40lLDiIB5ckW8d8xpgaYeEQAFo1rsc9P+rO/7am88F3+90uxxhTB1g4BIhbB3ak1wWNeeT9ZDKsYz5jTDWzcAgQIcFBPDbqQtIzc3j6o61ul2OMqeUsHAJIfPumjB/Qgflf72TDnuNul2OMqcUsHALMb6+OoXnDMO5/1zrmM8ZUHwuHANOkfigPXBPLut3HWLj6e7fLMcbUUhYOAWhE/AUM7tKCpz7cTHpGjtvlGGNqIQuHACQiTB8Rx6m8Ap74wDrmM8ZUPQuHANW1ZSPuvLQLb3+7h6+3W8d8xpiqZeEQwKb+sCsdmjfgj+9uIDe/0O1yjDG1iIVDAKsXGswjI3qxPf0kc76wjvmMMVXHwiHAXdGjJcPiWvP3T7bx/eEst8sxxtQSPoWDiAwVkS0ikiIi95Yz32gRURFJcIbHi8har1ehiMSLSMRZ4w+JyN+cZSaKSLrXtNurZldrrweviyUkSHhoyQbrmM8YUyUqDAcRCQaeBYYBscA4EYktZb4IYBqwqmicqi5Q1XhVjQduBXaq6lpVzSga70zbBbzttbo3vKbPPa89rAPaNKnPr6/qzmdb0lmWbB3zGWPOny9HDv2BFFVNVdVcYBEwopT5ZgBPAdllrGccsPDskSLSDWgJfOFTxaZUEwdH07NNYx55fyMnc/LdLscYE+B8CYe2wG6v4TRnXDER6QO0V9Wl5axnDKWEA57QeEPPPB9yg4isF5G3RKR9aSsTkSkikigiienp6T7sRu0WEhzEoyPj2Hc8m799bB3zGWPOjy/hIKWMK/4iF5EgYCZwT5krEBkAZKnqhlImj+XM0HgfiFbV3sDHwPzS1qmqs1U1QVUToqKiKt6LOqBvx2aM69+BF7/ayaZ9J9wuxxgTwHwJhzTA+9d7O2Cv13AEEAcsF5GdwEBgSVGjtOPsAABARC4CQlQ1qWicqh5W1aI+IeYAfX2o0Th+P7QHTeuHcv8731FoHfMZY86RL+GwBugmIp1EJAzPF/2SoomqelxVI1U1WlWjgZXAcFVNhOIjixvxtFWcrUQ7hIi08RocDlj/EJXQtEEY9/24J998f4zFibsrXsAYY0pRYTioaj4wFViG54t6saomi8h0ERnuwzYuBdJUtbS7tG6i5BHFNBFJFpF1eK5+mujDNoyXGy5uS/9OzXnyw80czrSO+YwxlSe14br4hIQETUxMdLsMv7LtQAbDnvmCkX3a8pcbL3K7HGOMHxKRJFVNKG2a3SFdS3VrFcEdl3bmraQ0Vu844nY5xpgAY+FQi037YTfaNq3PA+9+Zx3zGWMqxcKhFqsfFswjw3ux9UAm877c4XY5xpgAYuFQy10Z24ofxbbi759sI+2odcxnjPGNhUMd8NDwXgA8vGSjy5UYYwKFhUMd0LZpfX51ZTc+3nSA/1rH
fMYYH1g41BG3XdKJHq0ieOT9jWTlWsd8xpjyWTjUEaHBQTw6Ko49x07xzCfb3C7HGOPnLBzqkH7RzbkpoR3zvtjBlv0ZbpdjjPFjFg51zL3DetKoXggPvGsd8xljymbhUMc0bxjGH4b1ZM3Oo7z1TZrb5Rhj/JSFQx00um87Ejo244kPNnH0ZK7b5Rhj/JCFQx0UFCQ8OiqOE9n5/OnDzW6XY4zxQxYOdVRM68ZMvqQTi9bsJmmXdcxnjDmThUMd9ssh3bigST3uf2cDeQXWMZ8x5jQLhzqsYXgIDw3vxeb9Gbz81U63yzHG+BELhzruR7GtGBLTkpkfb2XvsVNul2OM8RMWDnWciPDw8F4UqvLI+8lul2OM8RM+hYOIDBWRLSKSIiL3ljPfaBFREUlwhseLyFqvV6GIxItIxFnjD4nI35xlwkXkDWdbq0Qkuip21JStffMGTBvSjWXJB/h08wG3yzHG+IEKw0FEgoFngWFALDBORGJLmS8CmAasKhqnqgtUNV5V44FbgZ2qulZVM4rGO9N2AW87i00GjqpqV2Am8Kfz20Xji9sv6Uy3lo148L1kTuUWuF2OMcZlvhw59AdSVDVVVXOBRcCIUuabATwFZJexnnHAwrNHikg3oCXwhTNqBDDfef8WMERExIc6zXkICwlixsg40o6e4p+fWcd8xtR1voRDW2C313CaM66YiPQB2qvq0nLWM4ZSwgFPaLyhqkUd/RRvT1XzgeNAi7MXEpEpIpIoIonp6ek+7IapyMDOLbjh4nbM/l8qKQetYz5j6jJfwqG0X+3FPbaJSBCe0z/3lLkCkQFAlqpuKGXyWM4MjXK3VzxCdbaqJqhqQlRUVFmbNpX0hx/H0CAshAfe3cDpvDbG1DW+hEMa0N5ruB2w12s4AogDlovITmAgsKSoUdpxdgAAICIXASGqmlTa9kQkBGgC2C28NaRFo3B+PzSGlalHeOfbPW6XY4xxiS/hsAboJiKdRCQMzxf9kqKJqnpcVSNVNVpVo4GVwHBVTYTiI4sb8bRVnK20doglwATn/WjgU7WfsDVqbL/29OnQlMf+vYnjWXlul2OMcUGF4eCc958KLAM2AYtVNVlEpovIcB+2cSmQpqqppUy7iZLhMA9oISIpwN1AmZfOmuoRFCQ8NvJCjp3K40/LrGM+Y+oiqQ0/yhMSEjQxMdHtMmqdGUs38uJXO3j7rsH06dDM7XKMMVVMRJJUNaG0aXaHtCnTr6/qTqsIT8d8+dYxnzF1ioWDKVOj8BAevC6WjftOMH/FLrfLMcbUIAsHU65hca25vEcUT/93C/uPl3V/ozGmtrFwMOUSEaYPjyO/UJmxdKPb5RhjaoiFg6lQhxYNmHpFV/793T6WbznodjnGmBpg4WB8MuWyznSOasiD7yVz/JTd+2BMbWfhYHwSHhLME6MuZN/xU4yfu5KjJ3PdLskYU40sHIzPBnRuwexbE9h6IJNxc1ZyKDPH7ZKMMdXEwsFUyhUxLXlxQj92Hj7J2NkrOXjCrmAypjaycDCVdkm3SOZP6s++Y6e4adYKe/a0MbWQhYM5JwM6t+CVyQM4nJnLTbNWsPtIltslGWOqkIWDOWd9OzZjwR0DyMjOZ8ysFew4dNLtkowxVcTCwZyX3u2asvCOgWTnFzJm1gp7gpw5L6dyCygsDPzOQGsD65XVVIltBzK4ee4qCguV124fQM82jd0uyQSYl7/awYx/b6JeSBA9WkcQ06YxMa0jiGndmB6tI2hSP9TtEmud8npltXAwVSY1PZOb56wiO7+A1yYPIK5tE7dLMgGgsFB57INNzPtyB5f3iCK6RUM27TvB5v0ZZ9xw2bZpfU9oOMHRs3UEnSIbEhJsJ0DOlYWDqTHfH85i3JyVnMjO45Xb+ttzIEy5svMK+NWitXyYvJ+Jg6P547WxBAd5HiOvquw/kc3m/Rls3pfB5v0n2Lwvg+3pmeQ7p57CgoPo2rIRMW0i6Nm6MTFtIujROoKoRuGI
lPY4euPNwsHUqLSjWYyfu4rDmbm8NKkf/aKbu12S8UOHM3O4/ZVE1u4+xgPXxDL5kk4+LZebX8j29MzisNi83xMcB06cvimzRcMwYtqcPiXVs3VjurVqRL3Q4OranYBk4WBq3P7j2dw8dyX7jmUzb0ICg7tGul2S8SOp6ZlMenkN+49n88zYeIbGtTnvdR45mcvm/SfY4nWkseVABtl5ngdVBQl0imxITOvGxaemYlpH0K5Z/Tp7lHHe4SAiQ4FngGBgrqo+WcZ8o4E3gX6qmigi44Hfes3SG7hYVdeKSBjwT+ByoBC4X1X/JSITgT8De5xl/qmqc8urz8LBP6Vn5HDL3FXsPHyS2T9J4LLuUW6XZPxA4s4j3P5KIkEizJ2QwMXVeOqxoFDZdfgkW/ZnsGl/Bpudtozvve7LaRQeUqIto3vrCBrXq/0N4OcVDiISDGwFrgLSgDXAOFXdeNZ8EcC/gTBgqqomnjX9QuA9Ve3sDD8CBKvqAyISBDRX1UNOOCSo6lRfd9DCwX8dOZnLLXNXkXIwk+fGX8yVsa3cLsm4aOn6vdy9eB1tm9bn5Un96NiioSt1ZObks/XAmW0Zm/ef4ER2fvE8bZvWp6f3qak2EUS3qF0N4OWFQ4gPy/cHUlQ11VnZImAEcPaTX2YATwG/KWM944CFXsO3ATEAqloIHPKhFhNgmjcMY+EdA/nJi6v46WtJ/GNcH4ZdeP6nEExgUVVm/S+VJ/+zmYSOzZjzkwSaNQxzrZ5G4SFc3KHZGUctqsq+49ls3n+CTfsyPKen9p/gsy3pFBQ1gIcE0b1VI3q0alwcHDFtIohsFO7WrlQbX8KhLbDbazgNGOA9g4j0Adqr6lIRKSscxuAJFUSkqTNuhohcDmzHc7RxwBl/g4hciueI5dequvvslYnIFGAKQIcOHXzYDeOWJg1Cee32AUx6aQ1TF37L0wWFjIhv63ZZpobkFxTy0JJkFqz6nmt6t+GvN17klw3DIsIFTetzQdP6/DDm9BFuTn4BKQcznbDIYNO+E/xvWzr/+iateJ7IRuHOPRmn2zK6tgzsBnBfwqG0lpric1HOKaGZwMQyVyAyAMhS1Q1e220HfKWqd4vI3cBfgFuB94GFqpojIj8F5gM/LFGA6mxgNnhOK/mwH8ZFEfVCmX9bfybPX8Ov3lhLTn4hNyW0d7ssU81O5uTzi4Xf8unmg9x5WWd+f3UMQUGB1fgbHhJMrwua0OuCM+/bOZyZU6It49WVu8jJ9zSABweJ0wAeQc82jenRKoKYNhG0bRoYDeC+tDkMAh5W1aud4fsAVPUJZ7gJnl/+mc4irYEjwPCidgcRmQmkq+rjzrA480eoaqGItAc+VNVeZ207GDiiquXeTWVtDoHjVG4BU15N5Itth3hsVBzjB3R0uyRTTQ6eyOa2+WvYuPcE00fEccvA2v/fuqBQ2Xn45Om2DOfU1O4jp3sujggPKb4fI6a15/RU91YRRLjQAH6+bQ5rgG4i0gnPFURjgZuLJqrqcaD4OkURWQ78xisYgoAbgUu9llEReR/PlUqfAkNw2jBEpI2q7nNmHQ5s8mkvTUCoHxbMnJ8k8PMF33D/OxvIySvkNh+vbzeBY+uBDCa9tIajWbnMm9CPK2Jaul1SjQgOErpENaJLVCOu6X26bS0jO8/TAO51me173+7ltZzvi+dp37x+ibaM6BYNi28KrGkVhoOq5ovIVGAZnktZX1TVZBGZDiSq6pIKVnEpkFbUoO3l98CrIvI3IB2Y5IyfJiLDgXw8RyATfd4bExDqhQbz/C19mbbwW6Yv3UhuQSE/vayL22WZKvJ1yiHufC2JeqHBLL5zkHWjgue0at+Ozenb8fQNoarKnmOnzmjL2Lw/g8+2HCxuAA8PCaJ7qzPbMmJaR9CiBhrA7SY445q8gkLuXryO99ft5e6rujNtSDe3SzLn6e1v0vj9v9bTKbIhL03qT9um9d0uKeBk53kawDc7bRlbDmSwaV/GGY/l
jYo43QB+be8LuKh903LWWLbzPa1kTLUIDQ7ib2PiCQsO4umPtpKbX8g9P+oeEI115kyqyj8+TeHpj7YyuEsLnr+lr/Wieo7qhQYT17ZJiSOu9Iyc4stri9oy5q/YRY/Wjc85HMpj4WBcFRwk/Hl0b8JChH9+lkJOfgF/+HFPC4gAkldQyB/e/o43k9K4/uK2PHl9b8JCas+NYv4iKiKcqIhwLul2uiua/IJCCqrp7I+Fg3FdUJDw+KgLCQ8JZs4XO8jJL+Th63oF3CWPddGJ7Dx+9to3fJlyiF8O6cavruxmwV6DQoKDqu1L3MLB+AUR4aHrYgkLCWL2/1LJzS/k8VEXWkD4sb3HTnHby2tIOZjJn0f35ka7b6VWsXAwfkNEuG9YDOEhQfzj0xRy8wt5anTvWtWXTW2RvPc4t728hqycAl6e1P+MUx2mdrBwMH5FRLjnRz0ICw7irx9tJbegkJlj4gm1gPAby7cc5OcLvqFx/VDevGsQMa3tkbC1kYWD8Uu/GNKN8NAgHv9gM7n5hfzj5j6EhwRuPzW1xcLV3/PAuxvo0SqClyb1o1Xjem6XZKqJ/RwzfmvKpV14+LpY/rvxAD99NYnsvAK3S6qzCguVpz7czH1vf8clXSNZ/NNBFgy1nIWD8WsT/68Tj4+6kOVb07l9fiKnci0galpOfgG/emMtzy3fzrj+HZg3IYFG4XbSobazcDB+7+YBHfjz6Iv4evshJr60mpM5+RUvZKrEsaxcbp23miXr9vK7oT14fFScXSBQR9h/ZRMQRvdtx8wx8STuOspPXlzNiew8t0uq9b4/nMX1z3/N2u+P8czYeH52eVe7h6EOsXAwAWNEfFuevbkP69OOccvcVRzLynW7pFpr3e5jXP/8VxzOzOXVyf3t4Ux1kIWDCShD49rwwi192bwvg5vnrOKwV2dkpmr8N3k/Y2avoH5YMP+6azADOrdwuyTjAgsHE3CG9GzF3AkJbE/PZNyclRzMyHa7pFrj5a92cOdrSfRo3Zi37/o/urZs5HZJxiUWDiYgXdo9ipcn9Sft6CnGzlrJ/uMWEOejsFCZsXQjD7+/kSt7tmLRHQOJiqj+ZwYY/2XhYALWoC4teOW2/hzMyOGmWStIO5rldkkBKXsX9xwAAA/OSURBVDuvgJ8t+IZ5X+5g4uBoXrilL/XD7IbDus7CwQS0hOjmvHb7AI5l5TJm1kp2HT7pdkkB5XBmDuPmrGTZxv08eG0sDw/v5dpjKY1/sXAwAS++fVNev2MgWbn53DRrBdvTM90uKSCkpmdy/fNfs3HvCZ4f39ee5W3O4FM4iMhQEdkiIikicm85840WERWRBGd4vIis9XoViki8My1MRGaLyFYR2SwiNzjjw0XkDWdbq0Qk+vx309R2cW2bsHDKQAoKlTGzVrJlf4bbJfm1xJ1HuP75r8nMzmfRlIEMjWvtdknGz1QYDiISDDwLDANigXEiElvKfBHANGBV0ThVXaCq8aoaD9wK7FTVtc7k+4GDqtrdWe/nzvjJwFFV7QrMBP50rjtn6paY1o1ZNGUQQQJjZ68gee9xt0vyS0vX7+Xmuato1iCMt382mD4dmrldkvFDvhw59AdSVDVVVXOBRcCIUuabATwFlHXZyDhgodfwbcATAKpaqKqHnPEjgPnO+7eAIWK3ZRofdW3ZiMV3DqJ+aDA3z1nFut3H3C7Jb6gqL3y+namvf0vvtk14+67BdGzR0O2yjJ/yJRzaAru9htOcccVEpA/QXlWXlrOeMTjhICJFT8OeISLfiMibItLq7O2paj5wHChxF46ITBGRRBFJTE9P92E3TF0RHdmQN+4cROP6IdwydxVJu464XZLr8gsKeeDdDTz5n81c07sNr90+gGYNw9wuy/gxX8KhtF/txU+0FpEgPKd/7ilzBSIDgCxV3eCMCgHaAV+p6sXACuAvvmyveITqbFVNUNWEqKgoH3bD1CXtmzdg8Z2DiIwI59Z5q1mZetjtklxzMief
Ka8msWDV99x5WWf+MbYP9ULtUlVTPl/CIQ3wfjhsO2Cv13AEEAcsF5GdwEBgSVGjtGMsZ55SOgxkAe84w28CF5+9PREJAZoA9tPPVFqbJvV5Y8pA2jatz8SXVvPFtrp3hHnwRDZjZq9g+ZaDPDoyjvuG9bTnchuf+BIOa4BuItJJRMLwfNEvKZqoqsdVNVJVo1U1GlgJDFfVRCg+srgRT1tF0TIKvA9c7owaAmx03i8BJjjvRwOfOvMbU2ktG9dj0ZSBdIpsxOT5iXy6+YDbJdWYrQcyGPXc16Smn2TehH7cMrCj2yWZAFJhODjn/acCy4BNwGJVTRaR6SIy3IdtXAqkqWrqWeN/DzwsIuvxXMlUdFpqHtBCRFKAu4EyL501xhctGoWz8I4B9GgVwZ2vJvHhhv1ul1Ttvt5+iBue/5rcgkIW3zmIK2Jaul2SCTBSG36UJyQkaGJiottlGD93IjuPiS+uZl3acf42Jp7rLrrA7ZKqxdvfpPH7f62nU2RDXprUn7ZN67tdkvFTIpKkqgmlTbM7pE2d0bheKK9MHkDfjs345aJv+VdSmtslVSlV5e+fbOPuxevoF92cN3862ILBnDMLB1OnNAoP4eVJ/RjUpQW/eWsdC1d/73ZJVSKvoJDfvbWepz/ayvUXt+XlSf1pUj/U7bJMALNwMHVOg7AQ5k3ox2Xdo7jv7e94ZcVOt0s6Lyey85j00hreTErjl0O68dcbLyIsxP7XNufH/oJMnVQvNJhZt/blqthWPPheMnO/OPt6icCw99gpbnphBStTD/Pn0b359VXd7TnPpkpYOJg6KzwkmOfGX8w1F7bh0X9v4tnPUtwuqVKS9x5n1HNfsefoKV6e1J8bE9pXvJAxPgpxuwBj3BQaHMQzY+MJCwniz8u2kJNXEBC/vpdvOcjPF3xDk/qhvHnXIGJaN3a7JFPLWDiYOi8kOIi/3HgRYcFB/P3TFHIKCrl3aIzfBsTC1d/zwLsb6NEqgpcm9aNV43pul2RqIQsHY4DgIOGJ6y8kLCSIWZ+nkpNXyEPXxfpVQBQWKn/57xaeW76dy7pH8ez4i2kUbv8Lm+phf1nGOIKChOkjehEWEsS8L3eQW1DIoyPi/KIvopz8An775nqWrNvLuP4dmDGiFyHB1mRoqo+FgzFeRIQHrulJeEgQzy3fTm5+IX+6oberz1U+lpXLlFeTWL3jCL8b2oO7LuviV0c0pnaycDDmLCLCb6/uQXhIMDM/3kpufiFP33SRK7/Udx/JYsJLq0k7copnxsYzIr5txQsZUwUsHIwphYjwyyu7ERoiPPXhFvIKCnlmbJ8avbls3e5jTJ6/hrwC5dXJ/RnQucQzr4ypNhYOxpTjZ5d3JTwkmBlLN5K3IIlnx19MeEj1Pyjnv8n7mbboW6Iiwlk0sT9dWzaq9m0a481atIypwORLOjFjZBwfbzrIHa8kkZ1XUK3be/mrHdz5WhI9Wjfm7bv+z4LBuMLCwRgf3DqwI0/d0JsvtqUz6aU1ZOXmV/k2CguVGUs38vD7G7myZysW3TGQqIjwKt+OMb6wcDDGRzf1a8/Mm+JZteMwE15cTUZ2XpWtOzuvgJ8t+IZ5X+5g4uBoXrilL/XD7DnPxj0WDsZUwsg+bfnHuIv59vtj3DJvNcezzj8gDmfmMG7OSpZt3M+D18by8PBerl46awxYOBhTadf0bsPzt/Rl094T3Dx3JUdP5p7zulLTM7n++a/ZuPcEz4/vy22XdKrCSo05dz6Fg4gMFZEtIpIiImU+01lERouIikiCMzxeRNZ6vQpFJN6ZttxZZ9G0ls74iSKS7jX+9qrYUWOq0lWxrZj9k76kHMxk3JyVpGfkVHodiTuPcP3zX5OZnc+iKQMZGte6Gio15txUGA4iEgw8CwwDYoFxIhJbynwRwDRgVdE4VV2gqvGqGg/cCuxU1bVei40vmq6qB73Gv+E1fu657Zox1evyHi15cWI/dh3OYuzsFRw4ke3zskvX7+Xm
uato3iCMt382mD4dmlVjpcZUni9HDv2BFFVNVdVcYBEwopT5ZgBPAWX9HzIOWHhOVRrjp/6vayTzb+vP/uPZ3DRrBXuOnSp3flXlhc+3M/X1b7moXRP+dddgOrZoWEPVGuM7X8KhLbDbazjNGVdMRPoA7VV1aTnrGUPJcHjJOXX0Rzmzs5gbRGS9iLwlIqU+wUREpohIoogkpqen+7AbxlSP/p2a8+rtAzhyMpcxs1aw+0hWqfPlFxTywLsbePI/m7m2dxtenTyAZg3DarhaY3zjSziUdtmEFk8UCQJmAveUuQKRAUCWqm7wGj1eVS8EfuC8bnXGvw9Eq2pv4GNgfmnrVNXZqpqgqglRUVE+7IYx1efiDs14/faBZGTnc9OsFew4dPKM6Sdz8pnyahILVn3PTy/rwt/H9qFeqF2qavyXL+GQBnj/em8H7PUajgDigOUishMYCCwpapR2jOWsowZV3eP8mwG8juf0Fap6WFWLWvfmAH193Rlj3HRhuyYsvGMgufmF3DRrBdsOZABw8EQ2Y2avYPmWgzw6Mo57h8X4RTfgxpTHl3BYA3QTkU4iEobni35J0URVPa6qkaoararRwEpguKomQvGRxY142ipwxoWISKTzPhS4FtjgDLfx2vZwYNN57J8xNSr2gsYsmjIQgLGzV7J0/V5GPfc1qeknmTehH7cM7Ohyhcb4psJwUNV8YCqwDM8X9WJVTRaR6SIy3IdtXAqkqWqq17hwYJmIrAfWAnvwHCUATBORZBFZh+fqp4k+740xfqBbqwgW3zmIsJAgpr7+LbkFhSy+cxBXxLR0uzRjfCaqWvFcfi4hIUETExPdLsOYM+w+ksW8L3dw+w860a5ZA7fLMaYEEUlS1YTSplmX3cZUk/bNG/Dw8F5ul2HMObHuM4wxxpRg4WCMMaYECwdjjDElWDgYY4wpwcLBGGNMCRYOxhhjSrBwMMYYU4KFgzHGmBJqxR3SIpIO7DrHxSOBQ1VYTlWxuirH6qo8f63N6qqc86mro6qW2q11rQiH8yEiiWXdPu4mq6tyrK7K89farK7Kqa667LSSMcaYEiwcjDHGlGDhALPdLqAMVlflWF2V56+1WV2VUy111fk2B2OMMSXZkYMxxpgSLByMMcaUUKvDQUSGisgWEUkRkXtLmT5RRNJFZK3zut1r2gQR2ea8JvhRXQVe45ecvWx11uXMc5OIbHQe5fq613jXPq8K6nLt8xKRmV7b3ioix7ymufn3VV5dbn5eHUTkMxH5VkTWi8iPvabd5yy3RUSu9oe6RCRaRE55fV4v1HBdHUXkE6em5SLSzmva+f99qWqtfAHBwHagMxAGrANiz5pnIvDPUpZtDqQ6/zZz3jdzuy5nWqaLn1c34NuizwJo6SefV6l1uf15nTX/L4AX/eHzKqsutz8vPA2rdznvY4GdXu/X4Xn2fCdnPcF+UFc0sMHFz+tNYILz/ofAq1X591Wbjxz6AymqmqqqucAiYISPy14NfKSqR1T1KPARMNQP6qpOvtR1B/Cs85mgqged8W5/XmXVVZ0q+99xHLDQee/251VWXdXJl7oUaOy8bwLsdd6PABapao6q7gBSnPW5XVd18qWuWOAT5/1nXtOr5O+rNodDW2C313CaM+5sNziHZW+JSPtKLlvTdQHUE5FEEVkpIiOrqCZf6+oOdBeRr5ztD63Esm7UBe5+XoDn8B/PL95PK7tsDdcF7n5eDwO3iEga8AGeoxpfl3WjLoBOzummz0XkB1VUk691rQNucN6PAiJEpIWPy1aoNoeDlDLu7Ot23weiVbU38DEwvxLLulEXQAf13Cp/M/A3EelSg3WF4DmFczmeX5xzRaSpj8u6URe4+3kVGQu8paoF57BsZZ1PXeDu5zUOeFlV2wE/Bl4VkSAfl3Wjrn14Pq8+wN3A6yLSmKrhS12/AS4TkW+By4A9QL6Py1aoNodDGuD9i7sdZx0OquphVc1xBucAfX1d1qW6UNW9zr+pwHKg
T03V5czznqrmOYf3W/B8Kbv6eZVTl9ufV5GxnHnqxu3Pq6y63P68JgOLne2vAOrh6VTO7c+r1Lqc01yHnfFJeNoIutdUXaq6V1Wvd8LpfmfccR/3qWLV0ZjiDy88vyZT8Rw2FzXo9DprnjZe70cBK/V0g84OPI05zZz3zf2grmZAuPM+EthGOY2N1VDXUGC+1/Z3Ay384PMqqy5XPy9nvh7ATpwbTv3h76ucutz++/oPMNF53xPPF5oAvTizQTqVqmuQPp+6oorqwNNwvKeG/+4jgSDn/WPA9Kr8+zrvnfDnF55DwK14Ev1+Z9x0YLjz/gkg2fngPwNivJa9DU/DVwowyR/qAgYD3znjvwMm13BdAjwNbHS2P9ZPPq9S63L783KGHwaeLGVZ1z6vsupy+/PC08D6lbP9tcCPvJa931luCzDMH+rCc76/6P/Tb4Drariu0XgCfCswFyfYq+rvy7rPMMYYU0JtbnMwxhhzjiwcjDHGlGDhYIwxpgQLB2OMMSVYOBhjjCnBwsEYY0wJFg7GGGNK+H/5RfDEqTbVCgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the CV error curve\n",
    "test_means = grid_search.cv_results_[ 'mean_test_score' ]\n",
    "test_stds = grid_search.cv_results_[ 'std_test_score' ]\n",
    "train_means = grid_search.cv_results_[ 'mean_train_score' ]\n",
    "train_stds = grid_search.cv_results_[ 'std_train_score' ]\n",
    "\n",
    "x_axis = subsample_s\n",
    "\n",
    "plt.plot(x_axis, -test_means)\n",
    "#plt.errorbar(x_axis, -test_scores[:,i], yerr=test_stds[:,i] ,label = str(max_depths[i]) +' Test')\n",
    "#plt.errorbar(x_axis, -train_scores[:,i], yerr=train_stds[:,i] ,label = str(max_depths[i]) +' Train')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([-0.47907025, -0.4767751 , -0.4754073 , -0.47663534, -0.47638331])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_means"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### subsample=0.8"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Column subsampling parameter: sub_feature/feature_fraction/colsample_bytree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 5 candidates, totalling 15 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "[Parallel(n_jobs=4)]: Done  12 out of  15 | elapsed: 24.0min remaining:  6.0min\n",
      "[Parallel(n_jobs=4)]: Done  15 out of  15 | elapsed: 30.2min finished\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "GridSearchCV(cv=StratifiedKFold(n_splits=3, random_state=3, shuffle=True),\n",
       "             error_score='raise-deprecating',\n",
       "             estimator=LGBMClassifier(bagging_freq=1, boosting_type='gbdt',\n",
       "                                      class_weight=None, colsample_bytree=1.0,\n",
       "                                      importance_type='split',\n",
       "                                      learning_rate=0.1, max_bin=127,\n",
       "                                      max_depth=7, min_child_samples=30,\n",
       "                                      min_child_weight=0.001,\n",
       "                                      min_split_gain=0.0, n_estimators=394,\n",
       "                                      n_jobs=4, num_class=9, num_leaves=70,\n",
       "                                      objective='multiclass', random_state=None,\n",
       "                                      reg_alpha=0.0, reg_lambda=0.0,\n",
       "                                      silent=False, subsample=0.8,\n",
       "                                      subsample_for_bin=200000,\n",
       "                                      subsample_freq=0),\n",
       "             iid='warn', n_jobs=4,\n",
       "             param_grid={'colsample_bytree': [0.5, 0.6, 0.7, 0.8, 0.9]},\n",
       "             pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',\n",
       "             scoring='neg_log_loss', verbose=5)"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':70,\n",
    "          'min_child_samples':30,\n",
    "          'max_bin': 127, # 2^7 - 1; the raw features are integers that rarely exceed 100\n",
    "          'subsample': 0.8,\n",
    "          'bagging_freq': 1,\n",
    "          #'colsample_bytree': 0.7,\n",
    "         }\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "\n",
    "colsample_bytree_s = [i/10.0 for i in range(5,10)]\n",
    "tuned_parameters = dict( colsample_bytree = colsample_bytree_s)\n",
    "\n",
    "grid_search = GridSearchCV(lg, n_jobs=4,  param_grid=tuned_parameters, cv = kfold, scoring=\"neg_log_loss\", verbose=5, refit = False,return_train_score='warn')\n",
    "grid_search.fit(X_train , y_train)\n",
    "#grid_search.best_estimator_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4739258649144454\n",
      "{'colsample_bytree': 0.5}\n"
     ]
    }
   ],
   "source": [
    "# examine the best model\n",
    "print(-grid_search.best_score_)\n",
    "print(grid_search.best_params_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAD4CAYAAAAHHSreAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3deXwV9b3/8dcnYVXCTgAhEFnDKksUaBUV3K9KVVRwt6301irWpb9rtdpWarVecfd6i9XWKu6tghQrCqJWRQmIsoeQgoQtYQtrCEk+vz/OhHtMAjmBwJwk7+fjkQdnZr4z85khOe8zM2fma+6OiIhItISwCxARkfijcBARkXIUDiIiUo7CQUREylE4iIhIOfXCLqA6tG7d2lNTU8MuQ0SkRpk3b94md29T0bRaEQ6pqalkZGSEXYaISI1iZqsPNE2nlUREpByFg4iIlKNwEBGRchQOIiJSjsJBRETKUTiIiEg5CgcRESlH4SAiUgPtLSrm99OXsm7bniOyfIWDiEgNs2bLbi7938+Z9HE2s5blHpF11Io7pEVE6opZyzZy62tfU+LOH68ezNl92h2R9SgcRERqgKLiEh79IJOnP1xJ7/ZNeeaqQXRudewRW5/CQUQkzuXt2Mv4V77i8+zNjD0phV9f0IdG9ROP6DoVDiIiceyL7M3c/MpXbC/Yx8OXnsDowR2PynoVDiIiccjdmfRxNg+9t5xOLY/hrz86ibR2TY/a+hUOIiJxJn/PPu5442veX7KR8/q14w+X9CepUf2jWoPCQUQkjixam8+Nk+ezbtse7j2/N9d/PxUzO+p1KBxEROKAu/Pa3DXcO3UxrY5twGs/Gcbgzi1Cq0fhICISsj2Fxfzq7UX8bX4Op3RvzWOXD6BVk4ah1qRwEBEJUXbeTn760nwyc3fw8zO6c/OI7iQmHP3TSGXF9PgMMzvHzJabWZaZ3XmQdqPNzM0sPRi+0swWRP2UmNkAM0sqM36TmT0WzNPQzF4L1vWFmaVWx4aKiMSbf3yznguf+pTcHQW8cP1J/PyMHnERDBDDkYOZJQJPA2cCOcBcM5vq7kvKtEsCxgNflI5z98nA5GB6P2CKuy8IJg+Imnce8Pdg8EfAVnfvZmZjgD8Alx/a5omIxJ/CohIeeHcpf/50FQM7NefpKwZxXPPGYZf1HbEcOZwEZLl7trsXAq8CoypoNwF4CCg4wHLGAq+UHWlm3YFk4JNg1CjgheD1m8BIC+NSvYjIEbBu2x4un/Q5f/50Fdd/P5XXxg2Lu2CA2MKhA7AmajgnGLefmQ0EUtx92kGWczkVhAOR0HjN3b3s+ty9CMgHWpWdyczGmVmGmWXk5eXFsBkiIuH6KDOP/3jiEzI37ODpKwbx6wv60KBefD4cO5YL0hV9avf9E80SgEeB6w64ALMhwG53X1TB5DHA1bGub/8I90nAJID09PRy00VE4kVxifPEzBU8MWsFPZKT+J+rBtG1TZOwyzqoWMIhB0iJGu4IrIsaTgL6ArODsz/tgKlmdqG7ZwRtxlDxKaUTgHruPq+C9eWYWT2gGbAlts0REYkvm3fu5eevLeCTFZu4eFAH7v9BPxo3OLIPzasOsYTDXKC7mR0PrCXyRn9F6UR3zwdalw6b2WzgjtJgCI4sLgWGV7Dsiq5DTAWuBT4HRgOzok45iYjUGPNWb+Wml+ezeVchD17cj8tPTAnlbudDUWk4uHuRmd0EvAckAs+7+2Izuw/IcPeplSxiOJDj7tkVTLsMOK/MuOeAF80si8gRw5jKahQRiSfuzp8/XcXvpy/luOaN+ftPv0ffDs3CLqtKrDZ8KE9PT/eMjIzKG4qIHGE7CvbxX3/7hukLN3Bm77Y8fOkJNGt8dB+aFyszm+fu6RVN0x3SIiLVZNmG7fz0pfl8u2U3vzw3jXHDu9SY00hlKRxERKrBm/Ny+NXbC2naqD6v3DCUk45vGXZJ
h0XhICJyGAr2FfObqYt5de4ahnVpxeNjB5Cc1Cjssg6bwkFE5BCt3ryLGyfPZ/G67fzs9K7cekYP6iXG501tVaVwEBE5BDMWb+D2N74mwYznr0tnRFrbsEuqVgoHEZEq2FdcwsPvLeePH2fTv2Mznr5iECktjwm7rGqncBARidHG7QXc/PJXfLlqC1cP7cyvzu9Fw3rxf7fzoVA4iIjE4LOVmxj/ylfs2lvM42MGMGpAh8pnqsEUDiIiB1FS4jzz0UomzljO8a2P5ZUbhtK9bVLYZR1xCgcRkQPYtruQ217/mlnLcrnghON48OJ+HNuwbrxt1o2tFBGpoq/XbOPGyfPJ3VHAhFF9uGpo5xp7t/OhUDiIiERxd16as5oJ05bSJqkhb/zn9xiQ0jzsso46hYOISGDX3iLuemshUxas4/SebXjksgG0OLZB2GWFQuEgIgJk5e7gP1+aT3beTn5xdk9+empXEhLqzmmkshQOIlLnTVmwll/+fSHHNEjkpR8N4XvdWlc+Uy2ncBA5Qlbm7eTutxbSv2NzRqQlk965Ra157k5tsbeomN9NW8qLc1ZzYmoLnhw7iHbNav5D86qDwkHkCHB3fvm3hXyds415q7cy6eNsmjaqx6k9kzmjVzKn9mhD82Pq5rnseLFmy25+9vJ8vsnJZ9zwLvzi7J7UV3jvp3AQOQLemJfDl6u28ODF/Tj/hOP414o8Zi7N5cPlubzz9ToSDNI7t2REr2RGpiXTLblJnfqaZNhmLdvIra99TUmJ88erB3N2n3ZhlxR31E2oSDXbsquQkRNn07VNE17/ybDvXNQsKXG+WZvPrKUbmbksl8XrtgOQ0rIxI9PacnpaMkO7tKy1z+sJW3GJ8+j7mTz1YRa92zflmasG0bnVsWGXFZqDdROqcBCpZr9442ve+mot08afTFq7pgdtuz5/Dx8uy2PWso38K2sTBftKOKZBIid3a83IXsmcnpZcKzqOiQd5O/Zyy6tf8dnKzYw5MYXfXNiHRvXrdgirD2mRo+SL7M28MS+Hn5zapdJgAGjfrDFXDOnEFUM6UbCvmM9Xbmbmso3MWprLjCUbATihYzNGpLVlZK9k+hzXVKefDsGX/97CTS/PJ3/PPv57dH8uTU8Ju6S4F9ORg5mdAzwOJAJ/cvcHD9BuNPAGcKK7Z5jZlcAvopr0Bwa5+wIzawA8BZwGlAB3u/vfzOw64L+BtcE8T7n7nw5Wn44cJB4UFpVw3hOfsKewmPdvG84xDQ79s5e7s2zDDmYty2Xm0o18tWYb7tC2aUNGpCUzIq0t3+/W6rDWURe4O89+ks0f/rmcTi2P4X+uHESv9pWHdl1xWEcOZpYIPA2cCeQAc81sqrsvKdMuCRgPfFE6zt0nA5OD6f2AKe6+IJh8N5Dr7j3MLAGI7o37NXe/KdYNFIkHz36STVbuTp67Nv2w37TNjF7tm9KrfVN+dno3Nu/cy+zlecxalss7X6/nlS/X0KBeAt/r2oqRaZHTTx1b1L4OZw5H/p59/OKNr5mxZCPn9WvHHy7pT1Kj+mGXVWPE8ht8EpDl7tkAZvYqMApYUqbdBOAh4I4DLGcs8ErU8A+BNAB3LwE2xV62SHz5dvNunpi5gnP6tGNkr+rvLrJVk4ZcMrgjlwzuSGFRCRmrtjAzOKq4Z8pimLKYtHZJjEhLZmSvZAaktCCxDt/du2htPjdOns+6bXu45/ze/PD7qTodV0WxhEMHYE3UcA4wJLqBmQ0EUtx9mpkdKBwuJxIqmFnpU6wmmNlpwErgJnffGIy/xMyGA5nAre6+puzCzGwcMA6gU6dOMWyGyJHh7twzZRH1EoxfX9j7iK+vQb0EvtetNd/r1pp7zu9Ndt7O4PRTLpM+zuZ/Zq+kxTH1Ob1nMiN6JXNK9zY0a1w3PjG7O69nrOGeKYtpeUwDXvvJUAZ3bln5jFJOLOFQUdzuv1ARnBJ6FLjugAswGwLsdvdFUevtCHzq7reZ2W3Aw8DVwDvA
K+6+18z+E3gBGFGuAPdJwCSIXHOIYTtEjojpCzfwUWYe95zfm/bNGh/19Xdp04QubZrw41O6kL9nH5+syGNWcE/F379aS70E48TUlozslcyItGS6tGly1Gs8GvYUFnPPlEW8OS+HU7q35rHLB9CqScOwy6qxKr0gbWbDgN+4+9nB8C8B3P2BYLgZkU/+O4NZ2gFbgAvdPSNo8yiQ5+6/D4YtaJ/k7iVmlgL80937lFl3IrDF3ZsdrEZdkJaw7CjYx8iJH9G6SUOm3vT9uHo8RnGJs2DNVmYuzWXWslyWbdgBwPGtj42cfkpLJj21JQ3qxU/Nhyo7byc3Tp7P8o07GD+iO+NHdq/Tp9VidbhfZZ0LdDez44l8g2gMcEXpRHfPB/Y/pcrMZgN3RAVDAnApMDxqHjezd4h8U2kWMJLgGoaZtXf39UHTC4GlMW2lSAgmzsgkb+deJl2THlfBAJCYYAzu3JLBnVvy/85JI2frbj5clsvMZbm8OGc1z/3r3yQ1rMfwHm0YkZbMaT3b1MhP2tMXruf/vfkN9RONv1x/Eqf2aBN2SbVCpeHg7kVmdhPwHpGvsj7v7ovN7D4gw92nVrKI4UBO6QXtKP8FvGhmjwF5wPXB+PFmdiFQROQI5LqYt0bkKPomZxt//XwVVw3pXCM6g+nY4hiuHpbK1cNS2V1YxL9WbGLWsshRxT8WrscMBqY0Z2SvtoxISyatXVJcX8QtLCrhwXeX8fyn/2Zgp+Y8fcUgjmt+9E/r1Va6Q1rkEBSXOD94+lM2bC9g5u2n0rQGf0WypMRZvG57EBQb+TonH4DjmjUKnv3UlmFdW8XV3cTr8/fws8nzmf/tNq7/fiq/PLdXrTg9drTpDmmRavbi56tYuDafJ8YOrNHBAJCQYPTr2Ix+HZtxyxndyd1ewOzlecxctpG/z1/LS3O+pVH9BE7u1poRaZGjijAfa/3JijxueXUBe/cV89QVAzm//3Gh1VKbKRxEqmhDfgEPz8jklO6tuaB/+7DLqXbJTRtx2YkpXHZiCnuLivkiewszgwcFfrA0F4A+xzVlZFoyI3q1pX+HZkelx7TiEufJWSt4fOYKuic34ZmrBtO1ln7zKh7otJJIFf1s8nzeX7qRGT8fTmrruvNET3dnRe7O4NtPG5m3eislDq2bNOD0npGb707u3oYmDav/M+eWXYXc8upXfLJiExcP7MDvLuqrR4dUA51WEqkmHy6PXLy9/cwedSoYIPJIjx5tk+jRNomfntaVrbsK+Sgzj5nLcnlv8QbemJdD/URjaJdWwVdl29Kp1eE/0mP+t1v52eT5bN5VyAMX92PMiSlxfaG8ttCRg0iM9hQWc9ZjH1E/MYF3bzlFfS5E2VdcwrzVW/c/KHBl3i4AuiU3iZx+SktmcBW7SXV3/vLZKu7/x1LaN2/EM1cOpm+Hg97yJFWk/hxEqsFD/1zG/8xeySs3DGVY11ZhlxPXVm3atf9rsl/8ezP7ip2mjepxWnD6qbJuUncU7OPOvy3kHwvXc0avtky89ASaHVOzL/zHI51WEjlMmRt3MOnjbC4e1EHBEIPU1sfyw5OP54cnH8+Ogn38a8UmZi7L5cNluUytpJvUZRu2c+NL81m9ZTd3npvGT4Z30WmkEOjIQaQSJSXOmElzWL5xB7NuP7VG3kUcL0pKnK9ztu1/UOCS9d/tJrV9s0Y8+kEmSY3q89TYgQzpoiA+knTkIHIY3pyfw5ertvDgxf0UDIcpIcEY2KkFAzu14PazerI+f0/k9NPSXF758lv2FpUwtEtLnhg7UN2jhkzhIHIQW3YV8sD0paR3bsFl6lqy2rVv1pgrh3TmyiGd2VNYzMq8naS1S4q751TVRQoHkYN4YPpSdhQUcf9F/Y7KjV51WeMGifo2UhxRPIscwBfZm3ljXg4/PqULPdslhV2OyFGlcBCpQGFRCXe/vYgOzRszfmS3sMsROep0WkmkAs9+kk1W7k6evy5dj2mQOklHDiJlfLt5
N0/MXME5fdoxIq1t2OWIhELhIBLF3blnyiLqJRi/vrB32OWIhEbhIBJl+sINfJSZx21n9aR9M/UqJnWXwkEksL1gH799ZzF9jmvKtcM6h12OSKh0pU0k8MiMTPJ27mXSNem6CUvqPP0FiADf5Gzjhc9XcfXQzgxIaR52OSKhUzhInVdc4tz11kJaN2nIHWf3DLsckbgQUziY2TlmttzMsszszoO0G21mbmbpwfCVZrYg6qfEzAYE0xqY2SQzyzSzZWZ2STC+oZm9FqzrCzNLPfzNFDmwv36+ikVrt3Pv+b1p2kh9BohADOFgZonA08C5QG9grJmV+46fmSUB44EvSse5+2R3H+DuA4CrgVXuviCYfDeQ6+49guV+FIz/EbDV3bsBjwJ/ONSNE6nMhvwCJs7I5JTurTm/f/uwyxGJG7EcOZwEZLl7trsXAq8CoypoNwF4CCg4wHLGAq9EDf8QeADA3UvcfVMwfhTwQvD6TWCkqacPOULum7aYwuISfveDvupQRiRKLOHQAVgTNZwTjNvPzAYCKe4+7SDLuZwgHMys9IrfBDObb2ZvmFnpraj71+fuRUA+UK7HDzMbZ2YZZpaRl5cXw2aIfNeHy3KZvnADN5/ejc6tjg27HJG4Eks4VPRxan/3cWaWQOT0z+0HXIDZEGC3uy8KRtUDOgKfuvsg4HPg4VjWt3+E+yR3T3f39DZt2sSwGSL/Z09hMfdMWUTXNscy7tQuYZcjEndiCYccILqXk47AuqjhJKAvMNvMVgFDgamlF6UDY/juKaXNwG7grWD4DWBQ2fWZWT2gGbAlhjpFYvbkrBXkbN3D737Qj4b1EsMuRyTuxBIOc4HuZna8mTUg8kY/tXSiu+e7e2t3T3X3VGAOcKG7Z8D+I4tLiVyrKJ3HgXeA04JRI4ElweupwLXB69HALK8NHV1L3MjcuINJH2dzyaCODOuqPopFKlLpHdLuXmRmNwHvAYnA8+6+2MzuAzLcferBl8BwIMfds8uM/y/gRTN7DMgDrg/GPxeMzyJyxDAm9s0RObiSEufutxbSpFE97jovLexyROJWTI/PcPfpwPQy4+49QNvTygzPJnKqqWy71USCo+z4AiJHGiLV7s15OcxdtZU/XNKPVk0ahl2OSNzSHdJSZ2zZVcjv311KeucWXDo4pfIZROowhYPUGb+fvpSdBUXcf1E/EhJ0T4PIwSgcpE6Yk72ZN+fl8ONTutCzXVLY5YjEPYWD1HqFRSX86u1FdGzRmFtGdg+7HJEaQf05SK337CfZZOXu5Pnr0mncQPc0iMRCRw5Sq63evIsnZq7g3L7tGJHWtvIZRARQOEgt5u7cO2Ux9RKMey8o9yBhETkIhYPUWv9YuJ6PMvO4/ayetG/WOOxyRGoUhYPUStsL9nHfO0voc1xTrhnWOexyRGocXZCWWmnie8vJ27mXZ69Jp16iPgOJVJX+aqTW+SZnG3+ds5qrh3bmhJTmlc8gIuUoHKRWKSou4a63FtK6SUPuOLtn2OWI1FgKB6lVXpyzmkVrt3Pv+b1p2qh+2OWI1FgKB6k1NuQXMHFGJsN7tOH8/u3DLkekRlM4SK1x37TF7CsuYcKoPpjpwXoih0PhILXCh8tymb5wAzeP6EbnVseGXY5IjadwkBpvT2Ex90xZRNc2x3LD8C5hlyNSK+g+B6nxnpi1gpyte3h13FAa1tOD9USqg44cpEbL3LiDZz/O5pJBHRnapVXY5YjUGgoHqbFKSpy731pIk0b1uOu8tLDLEalVFA5SY705L4e5q7byy3PTaNWkYdjliNQqMYWDmZ1jZsvNLMvM7jxIu9Fm5maWHgxfaWYLon5KzGxAMG12sMzSacnB+OvMLC9q/I+rY0Oldtm8cy+/f3cpJ6a24NLBKWGXI1LrVHpB2swSgaeBM4EcYK6ZTXX3JWXaJQHjgS9Kx7n7ZGByML0fMMXdF0TNdqW7Z1Sw2tfc/aaqbozUHQ+8u4ydBUXcf1E/EhJ0T4NI
dYvlyOEkIMvds929EHgVGFVBuwnAQ0DBAZYzFnjlkKoUiTInezNvzsvhhuFd6NE2KexyRGqlWMKhA7AmajgnGLefmQ0EUtx92kGWcznlw+HPwamje+y7t7ReYmbfmNmbZlbhOQMzG2dmGWaWkZeXF8NmSG1QWFTC3W8tpGOLxowf0T3sckRqrVjCoaJjdt8/0SwBeBS4/YALMBsC7Hb3RVGjr3T3fsApwc/Vwfh3gFR37w98ALxQ0TLdfZK7p7t7eps2bWLYDKkNJn28kpV5u5gwqi+NG+ieBpEjJZZwyAGiP713BNZFDScBfYHZZrYKGApMLb0oHRhDmaMGd18b/LsDeJnI6SvcfbO77w2aPQsMjnVjpHZbvXkXT87K4ty+7Tg9LTnsckRqtVjCYS7Q3cyON7MGRN7op5ZOdPd8d2/t7qnungrMAS4svdAcHFlcSuRaBcG4embWOnhdHzgfWBQMRz9O80Jg6WFsn9QS7s49UxZTL8H49QV9wi5HpNar9NtK7l5kZjcB7wGJwPPuvtjM7gMy3H3qwZfAcCDH3bOjxjUE3guCIZHI6aNng2njzexCoAjYAlxXlQ2S2ukfC9fzcWYe957fm3bNGoVdjkitZ+5eeas4l56e7hkZFX0jVmqD7QX7GDnxI9o2bcjbN35ffUKLVBMzm+fu6RVN04P3JO5NfG85m3bu5U/XpCsYRI4S/aVJXPt6zTb+Omc11wztzAkpzcMuR6TOUDhI3CoqLuGutxbSpklDbj+7Z9jliNQpCgeJW3/9fDWL123n3gt607RR/bDLEalTFA4SlzbkFzBxxnKG92jDf/RrX/kMIlKtFA4Sl377zmKKSpwJo/rw3SeriMjRoHCQuDNr2UbeXbSBm0d0o3OrY8MuR6ROUjhIXNlTWMy9UxbTLbkJ44Z3DbsckTpL9zlIXHli1gpytu7h1XFDaVBPn11EwqK/Pokbyzfs4NmPsxk9uCNDu7QKuxyROk3hIHGhpMT51dsLadKoHned1yvsckTqPIWDxIU35q1h7qqt3HVuL1oe2yDsckTqPIWDhG7zzr088O4yTkxtwejBHcMuR0RQOEgc+P30ZewsKOL+i/qRkKB7GkTigcJBQvX5ys38bX4ONwzvQo+2SWGXIyIBhYOEZm9RMb96eyEdWzRm/IjuYZcjIlF0n4OE5tmPs1mZt4s/X3cijRskhl2OiETRkYOEYvXmXTw5K4vz+rXj9LTksMsRkTIUDnLUuTu/ensR9RMTuPf8PmGXIyIVUDjIUTftm/V8smITt5/Vg3bNGoVdjohUQOEgR1X+nn3cN20JfTs05ZphqWGXIyIHEFM4mNk5ZrbczLLM7M6DtBttZm5m6cHwlWa2IOqnxMwGBNNmB8ssnZYcjG9oZq8F6/rCzFIPfzMlXkycsZxNO/fy+4v6kah7GkTiVqXhYGaJwNPAuUBvYKyZ9a6gXRIwHviidJy7T3b3Ae4+ALgaWOXuC6Jmu7J0urvnBuN+BGx1927Ao8AfDnHbJM4sWLONF+es5pqhnenfsXnY5YjIQcRy5HASkOXu2e5eCLwKjKqg3QTgIaDgAMsZC7wSw/pGAS8Er98ERpq6AqvxiopLuPuthbRp0pDbz+4ZdjkiUolYwqEDsCZqOCcYt5+ZDQRS3H3aQZZzOeXD4c/BKaV7ogJg//rcvQjIB8o9v9nMxplZhpll5OXlxbAZEqYXPl/N4nXbufeC3jRtVD/sckSkErGEQ0Wf2n3/RLMEIqd/bj/gAsyGALvdfVHU6CvdvR9wSvBzdSzr2z/CfZK7p7t7eps2bSrfCgnN+vw9PDJjOaf2aMN/9GsfdjkiEoNYwiEHSIka7gisixpOAvoCs81sFTAUmFp6UTowhjJHDe6+Nvh3B/AykdNX31mfmdUDmgFbYtsciUe/nbqEohJnwqi+6AyhSM0QSzjMBbqb2fFm1oDIG/3U0onunu/urd091d1TgTnAhe6eAfuPLC4lcq2CYFw9M2sd
vK4PnA+UHlVMBa4NXo8GZrl7uSMHqRlmLt3IPxdvYPzI7nRqdUzY5YhIjCp9tpK7F5nZTcB7QCLwvLsvNrP7gAx3n3rwJTAcyHH37KhxDYH3gmBIBD4Ang2mPQe8aGZZRI4YxlRpiyRu7C4s4t4pi+mW3IQbTukSdjkiUgUxPXjP3acD08uMu/cAbU8rMzybyKmm6HG7gMEHmL+AyJGG1HBPzMxi7bY9vDZuKA3q6X5LkZpEf7FyRCzfsIM/fZLN6MEdGdKl3JfNRCTOKRyk2pWUOHe/tZAmjepx13m9wi5HRA6BwkGq3esZa8hYvZW7zu1Fy2MbhF2OiBwChYNUq8079/LAu8s4KbUlowd3DLscETlECgepVvdPX8quvUX87qK+JOjBeiI1lsJBqs1nKzfx9/lrGTe8Cz3aJoVdjogcBoWDVIu9RcX86u1FpLRszM0juoddjogcppjucxCpzKSPssnO28WfrzuRxg0Swy5HRA6TjhzksK3atIsnP8zivH7tOD0tOexyRKQaKBzksLg790xZRIPEBO49v0/Y5YhINVE4yGF555v1fLJiE7ef1YN2zRqFXY6IVBOFgxyy/D37mDBtCf06NOOaYalhlyMi1UgXpOWQPfzecjbv3Mtz16aTqHsaRGoVHTnIIVmwZhsvfbGaa4al0r9j87DLEZFqpnCQKisqLuGuvy+kTZOG3HZWj7DLEZEjQOEgVfbC56tZsn47v76gD00b1Q+7HBE5AhQOUiXr8/fwyIzlnNqjDef1axd2OSJyhCgcpEp+O3UJRSXOhFF9MdNFaJHaSt9Wkpjk7djL0x9m8c/FG/jF2T3p1OqYsEsSkSNI4SAHtXVXIX/8OJsXPlvF3qJixpyYwg2ndAm7LBE5wmIKBzM7B3gcSAT+5O4PHqDdaOAN4ER3zzCzK4FfRDXpDwxy9wVR80wFurh732D4N8ANQF7Q5C53n16lrZLDlr9nH8/96988/69/s6uwiAtPOI5bRnanS5smYZcmIkdBpeFgZonA08CZQA4w18ymuvuSMu2SgPHAF6Xj3H0yMDmY3g+YUiYYLgZ2VrDaR9394apvjhyuXXuL+Mtnq/jjRyvZXlDEuX3b8fMzetCznfpnEKlLYjlyOAnIctWfqVIAAAxsSURBVPdsADN7FRgFLCnTbgLwEHDHAZYzFnildMDMmgC3AeOA16tWtlS3PYXFvDRnNc98tJItuwoZmZbMrWf2oG+HZmGXJiIhiCUcOgBrooZzgCHRDcxsIJDi7tPM7EDhcDmRUCk1AZgI7K6g7U1mdg2QAdzu7ltjqFMOwd6iYl79cg1PfZhF3o69nNK9Nbee2YNBnVqEXZqIhCiWcKjo+4q+f6JZAvAocN0BF2A2BNjt7ouC4QFAN3e/1cxSyzR/hkhwOP8XID+sYJnjiBx10KlTpxg2Q6LtKy7hzXk5PDlzBevyCzgptSVPjR3IkC6twi5NROJALOGQA6REDXcE1kUNJwF9gdnB997bAVPN7EJ3zwjajCHqlBIwDBhsZquCGpLNbLa7n+buG0sbmdmzwLSKinL3ScAkgPT0dK+ojZRXXOK8/dVaHp+5gm+37GZASnP+MLo/J3drrfsWRGS/WMJhLtDdzI4H1hJ5o7+idKK75wOtS4fNbDZwR2kwBEcWlwLDo+Z5hsgRAsGRwzR3Py0Ybu/u64OmFwGLDmnL5DtKSpx/LFzPox9kkp23iz7HNeX569I5vWeyQkFEyqk0HNy9yMxuAt4j8lXW5919sZndB2S4+9RKFjEcyCm9oB2Dh4LTTg6sAn4S43xSAXdnxpKNPPp+Jss27KBH2yb871WDOKt3OxL0mG0ROQBzr/lnZNLT0z0jI6PyhnWIuzM7M49HZmSycG0+x7c+lp+f0Z3z+x+nvhdEBAAzm+fu6RVN0x3StdBnWZt4eMZy5n+7jY4tGvPfo/tz0cAO1EvUo7REJDYKh1pk7qotTJyxnDnZW2jXtBH3X9SXSwen0KCe
QkFEqkbhUAt8vWYbE9/P5OPMPFo3acivL+jN2JM60ah+YtiliUgNpXCowZas284j72fywdKNtDimPr88N42rh3XmmAb6bxWRw6N3kRooK3cHj76/gn8sXE9So3rcfmYPrj/5eJo01H+niFQPvZvUIKs27eLxmSuYsmAtjesncvOIbvz45C40O0ZddYpI9VI41AA5W3fz5Mws3pyfQ/1E44ZTuvCTU7vS8tgGYZcmIrWUwiGObdxewFOzsnh17rcYxtVDO3Pj6V1JTmoUdmkiUsspHOLQpp17eWb2Sl6as5riEueyE1O46fRuHNe8cdiliUgdoXCII9t2R7rk/MunkS45Lx7UkfEjuqu/ZhE56hQOcWB7wT6e+yTSJefOwiIu6H8ct5zRna7qklNEQqJwCFFpl5yTPs4mf88+zunTjlvPVJecIhI+hUMICvYFXXLOXsnmXYWMSEvmNnXJKSJxROFwFO0tKua1uWt4alYWuTv2cnK31tx2lrrkFJH4o3A4CvYVl/C3eTk8OSuLtdv2cFJqS54YO5Ch6pJTROKUwuEIKi5xpiyIdMm5evNuTkhpzoOX9FOXnCIS9xQOR0BJiTN90Xoe+2AFWbk76d2+Kc9dm86INHXJKSI1g8KhGrk77y/ZyCNBl5zdk5vwzJWDOLuPuuQUkZpF4VAN3J2PMvN45P1MvsnJJ7XVMTw+ZoC65BSRGkvhcJg+W7mJR2ZkkrF6Kx2aN+ah0f25WF1yikgNp3A4RPNWb2HijEw+W7mZdk0b8bsf9OWydHXJKSK1Q0zhYGbnAI8DicCf3P3BA7QbDbwBnOjuGWZ2JfCLqCb9gUHuviBqnqlAF3fvGwy3BF4DUoFVwGXuvrWK23XELMzJZ+L7y5m9PNIl573n9+aKIeqSU0Rql0rDwcwSgaeBM4EcYK6ZTXX3JWXaJQHjgS9Kx7n7ZGByML0fMKVMMFwM7CyzyjuBme7+oJndGQz/1yFsW7Vaun47j76fyYwlG2l+TH3uPDeNa9Qlp4jUUrG8s50EZLl7NoCZvQqMApaUaTcBeAi44wDLGQu8UjpgZk2A24BxwOtR7UYBpwWvXwBmE2I4ZOXu5LEPMpn2TaRLztvO7MH1308lqZF6XxOR2iuWcOgArIkazgGGRDcws4FAirtPM7MDhcPlRN74S00AJgK7y7Rr6+7rAdx9vZklV7QwMxtHJFjo1KlTDJtRNas3R7rkfPurSJecN53ejRtOUZecIlI3xBIOFX0X0/dPNEsAHgWuO+ACzIYAu919UTA8AOjm7reaWWoV6v2/AtwnAZMA0tPTvZLmMVu7bQ9PzVrBGxk51Es0fnxKF34yvAutmjSsrlWIiMS9WMIhB0iJGu4IrIsaTgL6ArODu3/bAVPN7EJ3zwjajCHqlBIwDBhsZquCGpLNbLa7nwZsNLP2wVFDeyC36ptVdbnbC3j6wyxe+TJykHTV0M7ceFpXkpuqS04RqXtiCYe5QHczOx5YS+SN/orSie6eD7QuHTaz2cAdpcEQHFlcCgyPmucZ4JlgeiowLQgGgKnAtcCDwb9TDmXDYrV5517+96OV/PXzSJecl6ancPMIdckpInVbpeHg7kVmdhPwHpGvsj7v7ovN7D4gw92nVrKI4UBO6QXtGDwIvG5mPwK+JRIsR8Trc9fwm3cWU7CvmIsGduSWkeqSU0QEwNyr7XR9aNLT0z0jI6PyhmV8vnIzL3/5LT9Xl5wiUgeZ2Tx3T69oWp3+kv6wrq0Y1lV9KoiIlKVnPYiISDkKBxERKUfhICIi5SgcRESkHIWDiIiUo3AQEZFyFA4iIlKOwkFERMqpFXdIm1kesPoQZ28NbKrGcqqL6qoa1VV18Vqb6qqaw6mrs7u3qWhCrQiHw2FmGQe6fTxMqqtqVFfVxWttqqtqjlRdOq0kIiLlKBxERKQchUPQm1wcUl1Vo7qqLl5rU11Vc0TqqvPXHEREpDwdOYiISDkKBxERKadWh4OZnWNmy80sy8zu
rGD6dWaWZ2YLgp8fR0271sxWBD/XxlFdxVHjK+uitVrrCtpcZmZLzGyxmb0cNT60/VVJXaHtLzN7NGrdmWa2LWpamL9fB6srzP3Vycw+NLOvzOwbMzsvatovg/mWm9nZ8VCXmaWa2Z6o/fW/R7muzmY2M6hptpl1jJp2+L9f7l4rf4j0d70S6AI0AL4Gepdpcx3wVAXztgSyg39bBK9bhF1XMG1niPurO/BV6b4AkuNkf1VYV9j7q0z7m4n0vx76/jpQXWHvLyIXVn8avO4NrIp6/TXQEDg+WE5iHNSVCiwKcX+9AVwbvB4BvFidv1+1+cjhJCDL3bPdvRB4FRgV47xnA++7+xZ33wq8D5wTB3UdSbHUdQPwdLBPcPfcYHzY++tAdR1JVf1/HAu8ErwOe38dqK4jKZa6HGgavG4GrAtejwJedfe97v5vICtYXth1HUmx1NUbmBm8/jBqerX8ftXmcOgArIkazgnGlXVJcFj2ppmlVHHeo10XQCMzyzCzOWb2g2qqKda6egA9zOzTYP3nVGHeMOqCcPcXEDn8J/KJd1ZV5z3KdUG4++s3wFVmlgNMJ3JUE+u8YdQFcHxwuukjMzulmmqKta6vgUuC1xcBSWbWKsZ5K1Wbw8EqGFf2e7vvAKnu3h/4AHihCvOGURdAJ4/cKn8F8JiZdT2KddUjcgrnNCKfOP9kZs1jnDeMuiDc/VVqDPCmuxcfwrxVdTh1Qbj7ayzwF3fvCJwHvGhmCTHOG0Zd64nsr4HAbcDLZtaU6hFLXXcAp5rZV8CpwFqgKMZ5K1WbwyEHiP7E3ZEyh4Puvtnd9waDzwKDY503pLpw93XBv9nAbGDg0aoraDPF3fcFh/fLibwph7q/DlJX2Pur1Bi+e+om7P11oLrC3l8/Al4P1v850IjIQ+XC3l8V1hWc5tocjJ9H5BpBj6NVl7uvc/eLg3C6OxiXH+M2Ve5IXEyJhx8inyaziRw2l17Q6VOmTfuo1xcBc/z/Luj8m8jFnBbB65ZxUFcLoGHwujWwgoNcbDwCdZ0DvBC1/jVAqzjYXweqK9T9FbTrCawiuOE0Hn6/DlJX2L9f7wLXBa97EXlDM6AP370gnU31XZA+nLralNZB5MLx2qP8e98aSAhe3w/cV52/X4e9EfH8Q+QQMJNIot8djLsPuDB4/QCwONjxHwJpUfP+kMiFryzg+nioC/gesDAYvxD40VGuy4BHgCXB+sfEyf6qsK6w91cw/BvgwQrmDW1/HaiusPcXkQusnwbrXwCcFTXv3cF8y4Fz46EuIuf7S/9O5wMXHOW6RhMJ8EzgTwTBXl2/X3p8hoiIlFObrzmIiMghUjiIiEg5CgcRESlH4SAiIuUoHEREpByFg4iIlKNwEBGRcv4/pkmI+WHw6nEAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot the CV error curve\n",
    "test_means = grid_search.cv_results_[ 'mean_test_score' ]\n",
    "test_stds = grid_search.cv_results_[ 'std_test_score' ]\n",
    "train_means = grid_search.cv_results_[ 'mean_train_score' ]\n",
    "train_stds = grid_search.cv_results_[ 'std_train_score' ]\n",
    "\n",
    "x_axis = colsample_bytree_s\n",
    "\n",
    "plt.plot(x_axis, -test_means)\n",
    "#plt.errorbar(x_axis, -test_scores[:,i], yerr=test_stds[:,i] ,label = str(max_depths[i]) +' Test')\n",
    "#plt.errorbar(x_axis, -train_scores[:,i], yerr=train_stds[:,i] ,label = str(max_depths[i]) +' Train')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try even smaller values: the features include both the original and the tf-idf features, so there are rather a lot of them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 2 candidates, totalling 6 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
      "[Parallel(n_jobs=4)]: Done   3 out of   6 | elapsed:  6.1min remaining:  6.1min\n",
      "[Parallel(n_jobs=4)]: Done   6 out of   6 | elapsed: 10.1min finished\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "GridSearchCV(cv=StratifiedKFold(n_splits=3, random_state=3, shuffle=True),\n",
       "             error_score='raise-deprecating',\n",
       "             estimator=LGBMClassifier(bagging_freq=1, boosting_type='gbdt',\n",
       "                                      class_weight=None, colsample_bytree=1.0,\n",
       "                                      importance_type='split',\n",
       "                                      learning_rate=0.1, max_bin=127,\n",
       "                                      max_depth=7, min_child_samples=30,\n",
       "                                      min_child_weight=0.001,\n",
       "                                      min_split_gain=0.0, n_estimators=394,\n",
       "                                      n_jobs=4, num_class=9, num_leaves=70,\n",
       "                                      objective='multiclass', random_state=None,\n",
       "                                      reg_alpha=0.0, reg_lambda=0.0,\n",
       "                                      silent=False, subsample=0.8,\n",
       "                                      subsample_for_bin=200000,\n",
       "                                      subsample_freq=0),\n",
       "             iid='warn', n_jobs=4, param_grid={'colsample_bytree': [0.3, 0.4]},\n",
       "             pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',\n",
       "             scoring='neg_log_loss', verbose=5)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.1,\n",
    "          'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':70,\n",
    "          'min_child_samples':30,\n",
    "          'max_bin': 127, # 2^7 - 1; the raw features are integers that rarely exceed 100\n",
    "          'subsample': 0.8,\n",
    "          'bagging_freq': 1,\n",
    "          #'colsample_bytree': 0.7,\n",
    "         }\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "\n",
    "colsample_bytree_s = [i/10.0 for i in range(3,5)]\n",
    "tuned_parameters = dict( colsample_bytree = colsample_bytree_s)\n",
    "\n",
    "grid_search = GridSearchCV(lg, n_jobs=4,  param_grid=tuned_parameters, cv = kfold, scoring=\"neg_log_loss\", verbose=5, refit = False,return_train_score='warn')\n",
    "grid_search.fit(X_train , y_train)\n",
    "#grid_search.best_estimator_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4731996299261459\n",
      "{'colsample_bytree': 0.3}\n"
     ]
    }
   ],
   "source": [
    "# examine the best model\n",
    "print(-grid_search.best_score_)\n",
    "print(grid_search.best_params_)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### colsample_bytree=0.4\n",
    "The search above reported 0.3 as the best value; the cells below use 0.4."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Regularization parameters lambda_l1 (reg_alpha) and lambda_l2 (reg_lambda): probably not worth tuning here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lower the learning rate and re-tune n_estimators"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "best n_estimators: 3648\n",
      "best cv score: 0.4645050558729036\n"
     ]
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.01,\n",
    "          #'n_estimators':n_estimators_1,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':70,\n",
    "          'min_child_samples':30,\n",
    "          'max_bin': 127, # 2^7 - 1; the raw features are integers that rarely exceed 100\n",
    "          'subsample': 0.8,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.4,\n",
    "         }\n",
    "n_estimators_2 = get_n_estimators(params , X_train , y_train)"
   ]
  },
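  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Retrain the model on all training data with the best parameters\n",
    "Since there are more samples, should the model complexity be raised slightly?\n",
    "num_leaves is increased by 5;\n",
    "min_child_samples is scaled up in proportion to the sample count, to 40."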
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LGBMClassifier(bagging_freq=1, boosting_type='gbdt', class_weight=None,\n",
       "               colsample_bytree=0.4, importance_type='split',\n",
       "               learning_rate=0.01, max_bin=127, max_depth=7,\n",
       "               min_child_samples=40, min_child_weight=0.001, min_split_gain=0.0,\n",
       "               n_estimators=3648, n_jobs=4, num_class=9, num_leaves=75,\n",
       "               objective='multiclass', random_state=None, reg_alpha=0.0,\n",
       "               reg_lambda=0.0, silent=False, subsample=0.8,\n",
       "               subsample_for_bin=200000, subsample_freq=0)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params = {'boosting_type': 'gbdt',\n",
    "          'objective': 'multiclass',\n",
    "          'num_class':9, \n",
    "          'n_jobs': 4,\n",
    "          'learning_rate': 0.01,\n",
    "          'n_estimators':n_estimators_2,\n",
    "          'max_depth': 7,\n",
    "          'num_leaves':75,\n",
    "          'min_child_samples':40,\n",
    "          'max_bin': 127, # 2^7 - 1; the raw features are integers that rarely exceed 100\n",
    "          'subsample': 0.8,\n",
    "          'bagging_freq': 1,\n",
    "          'colsample_bytree': 0.4,\n",
    "         }\n",
    "\n",
    "lg = LGBMClassifier(silent=False,  **params)\n",
    "lg.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Save the model for later testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "import _pickle as cPickle\n",
    "\n",
    "# Use a context manager so the file handle is closed after writing\n",
    "with open('./data/' + \"Otto_LightGBM_org_tfidf.pkl\", 'wb') as f:\n",
    "    cPickle.dump(lg, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Feature importance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "df = pd.DataFrame({\"columns\":list(feat_names), \"importance\":list(lg.feature_importances_.T)})\n",
    "df = df.sort_values(by=['importance'],ascending=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>columns</th>\n",
       "      <th>importance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>159</th>\n",
       "      <td>feat_67_tfidf</td>\n",
       "      <td>37470</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>117</th>\n",
       "      <td>feat_25_tfidf</td>\n",
       "      <td>36468</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>116</th>\n",
       "      <td>feat_24_tfidf</td>\n",
       "      <td>33995</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>140</th>\n",
       "      <td>feat_48_tfidf</td>\n",
       "      <td>33585</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>132</th>\n",
       "      <td>feat_40_tfidf</td>\n",
       "      <td>31993</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>178</th>\n",
       "      <td>feat_86_tfidf</td>\n",
       "      <td>30079</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>106</th>\n",
       "      <td>feat_14_tfidf</td>\n",
       "      <td>28838</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>154</th>\n",
       "      <td>feat_62_tfidf</td>\n",
       "      <td>23267</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>107</th>\n",
       "      <td>feat_15_tfidf</td>\n",
       "      <td>21270</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>108</th>\n",
       "      <td>feat_16_tfidf</td>\n",
       "      <td>21248</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>66</th>\n",
       "      <td>feat_67</td>\n",
       "      <td>20992</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125</th>\n",
       "      <td>feat_33_tfidf</td>\n",
       "      <td>20535</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>156</th>\n",
       "      <td>feat_64_tfidf</td>\n",
       "      <td>20391</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>134</th>\n",
       "      <td>feat_42_tfidf</td>\n",
       "      <td>19616</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>180</th>\n",
       "      <td>feat_88_tfidf</td>\n",
       "      <td>19378</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>126</th>\n",
       "      <td>feat_34_tfidf</td>\n",
       "      <td>19279</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>146</th>\n",
       "      <td>feat_54_tfidf</td>\n",
       "      <td>17998</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>164</th>\n",
       "      <td>feat_72_tfidf</td>\n",
       "      <td>17577</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>feat_24</td>\n",
       "      <td>17120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>100</th>\n",
       "      <td>feat_8_tfidf</td>\n",
       "      <td>16650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>152</th>\n",
       "      <td>feat_60_tfidf</td>\n",
       "      <td>16251</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>feat_48</td>\n",
       "      <td>16168</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>124</th>\n",
       "      <td>feat_32_tfidf</td>\n",
       "      <td>15916</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>feat_25</td>\n",
       "      <td>15544</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>162</th>\n",
       "      <td>feat_70_tfidf</td>\n",
       "      <td>15201</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39</th>\n",
       "      <td>feat_40</td>\n",
       "      <td>14690</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>feat_36_tfidf</td>\n",
       "      <td>14544</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>feat_14</td>\n",
       "      <td>14155</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>167</th>\n",
       "      <td>feat_75_tfidf</td>\n",
       "      <td>13971</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>feat_86</td>\n",
       "      <td>13873</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>153</th>\n",
       "      <td>feat_61_tfidf</td>\n",
       "      <td>2037</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>64</th>\n",
       "      <td>feat_65</td>\n",
       "      <td>2006</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>feat_30</td>\n",
       "      <td>1958</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>feat_3</td>\n",
       "      <td>1909</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>feat_29</td>\n",
       "      <td>1862</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>176</th>\n",
       "      <td>feat_84_tfidf</td>\n",
       "      <td>1805</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>feat_21</td>\n",
       "      <td>1716</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>98</th>\n",
       "      <td>feat_6_tfidf</td>\n",
       "      <td>1711</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>feat_19</td>\n",
       "      <td>1631</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>feat_46</td>\n",
       "      <td>1576</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>feat_45</td>\n",
       "      <td>1537</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48</th>\n",
       "      <td>feat_49</td>\n",
       "      <td>1520</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>51</th>\n",
       "      <td>feat_52</td>\n",
       "      <td>1500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>feat_23</td>\n",
       "      <td>1491</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>76</th>\n",
       "      <td>feat_77</td>\n",
       "      <td>1473</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62</th>\n",
       "      <td>feat_63</td>\n",
       "      <td>1429</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>feat_12</td>\n",
       "      <td>1392</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>feat_7</td>\n",
       "      <td>1363</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>feat_2</td>\n",
       "      <td>1233</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143</th>\n",
       "      <td>feat_51_tfidf</td>\n",
       "      <td>1194</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>92</th>\n",
       "      <td>feat_93</td>\n",
       "      <td>1171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>feat_31</td>\n",
       "      <td>1085</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>feat_28</td>\n",
       "      <td>1061</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>feat_5</td>\n",
       "      <td>998</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>80</th>\n",
       "      <td>feat_81</td>\n",
       "      <td>890</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>60</th>\n",
       "      <td>feat_61</td>\n",
       "      <td>695</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>81</th>\n",
       "      <td>feat_82</td>\n",
       "      <td>594</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>feat_6</td>\n",
       "      <td>441</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>83</th>\n",
       "      <td>feat_84</td>\n",
       "      <td>439</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50</th>\n",
       "      <td>feat_51</td>\n",
       "      <td>277</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>186 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           columns  importance\n",
       "159  feat_67_tfidf       37470\n",
       "117  feat_25_tfidf       36468\n",
       "116  feat_24_tfidf       33995\n",
       "140  feat_48_tfidf       33585\n",
       "132  feat_40_tfidf       31993\n",
       "178  feat_86_tfidf       30079\n",
       "106  feat_14_tfidf       28838\n",
       "154  feat_62_tfidf       23267\n",
       "107  feat_15_tfidf       21270\n",
       "108  feat_16_tfidf       21248\n",
       "66         feat_67       20992\n",
       "125  feat_33_tfidf       20535\n",
       "156  feat_64_tfidf       20391\n",
       "134  feat_42_tfidf       19616\n",
       "180  feat_88_tfidf       19378\n",
       "126  feat_34_tfidf       19279\n",
       "146  feat_54_tfidf       17998\n",
       "164  feat_72_tfidf       17577\n",
       "23         feat_24       17120\n",
       "100   feat_8_tfidf       16650\n",
       "152  feat_60_tfidf       16251\n",
       "47         feat_48       16168\n",
       "124  feat_32_tfidf       15916\n",
       "24         feat_25       15544\n",
       "162  feat_70_tfidf       15201\n",
       "39         feat_40       14690\n",
       "128  feat_36_tfidf       14544\n",
       "13         feat_14       14155\n",
       "167  feat_75_tfidf       13971\n",
       "85         feat_86       13873\n",
       "..             ...         ...\n",
       "153  feat_61_tfidf        2037\n",
       "64         feat_65        2006\n",
       "29         feat_30        1958\n",
       "2           feat_3        1909\n",
       "28         feat_29        1862\n",
       "176  feat_84_tfidf        1805\n",
       "20         feat_21        1716\n",
       "98    feat_6_tfidf        1711\n",
       "18         feat_19        1631\n",
       "45         feat_46        1576\n",
       "44         feat_45        1537\n",
       "48         feat_49        1520\n",
       "51         feat_52        1500\n",
       "22         feat_23        1491\n",
       "76         feat_77        1473\n",
       "62         feat_63        1429\n",
       "11         feat_12        1392\n",
       "6           feat_7        1363\n",
       "1           feat_2        1233\n",
       "143  feat_51_tfidf        1194\n",
       "92         feat_93        1171\n",
       "30         feat_31        1085\n",
       "27         feat_28        1061\n",
       "4           feat_5         998\n",
       "80         feat_81         890\n",
       "60         feat_61         695\n",
       "81         feat_82         594\n",
       "5           feat_6         441\n",
       "83         feat_84         439\n",
       "50         feat_51         277\n",
       "\n",
       "[186 rows x 2 columns]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "pycharm": {
     "is_executing": false
    },
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYMAAAD4CAYAAAAO9oqkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAXj0lEQVR4nO3df4zcd53f8efrnB9HDzg7ZEGW7Z4Dda8EpHOCm1iih2iAxAntObRclagi1jWV7zhHAvVaYYrUcECk0AqQIkFOoXFxThST8kOxLqbGyuWKkMiPDTg/TEi9mFxj4sYGhxBEG5rw7h/zWZg4s7uz692dmd3nQxrNzHu+39n3fGfm+5rv5/ud2VQVkqTl7TcG3YAkafAMA0mSYSBJMgwkSRgGkiTgjEE3MFfnnnturV+/ftBtSNJIeeCBB35UVWOn1kc2DNavX8/4+Pig25CkkZLkb3vVHSaSJBkGkiTDQJKEYSBJwjCQJGEYSJIwDCRJGAaSJAwDSRKGgaRlbv3OO1m/885BtzFwhoEkyTCQJBkGkiQMA0kShoG07LizVL0YBpIkw0CSZBhIkjAMJEn0EQZJfjPJfUkeTHIoyZ+3+ueS/CDJwXba2OpJclOSiSQPJbmw6762JTncTtu66m9K8nCb56YkWYgHK0nq7Yw+pnkOuKSqfpbkTOCbSb7Wbvt3VfWlU6a/HNjQThcDNwMXJzkHuB7YBBTwQJK9VfV0m2Y7cA+wD9gCfA1J0qKYccugOn7Wrp7ZTjXNLFuB29p89wArk6wGLgMOVNXJFgAHgC3ttldW1beqqoDbgCtP4zFJkmapr30GSVYkOQgcp7NCv7fddEMbCvpUkrNbbQ3wRNfsR1ttuvrRHvVefWxPMp5k/MSJE/20LknqQ19hUFUvVNVGYC1wUZI3Ah8E/gHwD4FzgA+0yXuN99cc6r36uKWqNlXVprGxsX5alyT1YVZHE1XVT4C/AbZU1bE2FPQc8F+Ai9pkR4F1XbOtBZ6cob62R12StEj6OZpoLMnKdvllwNuB77WxftqRP1cCj7RZ9gLXtKOKNgPPVNUxYD9waZJVSVYBlwL7223PJtnc7usa4I75fZiSpOn0czTRamB3khV0wuP2qvqrJH+dZIzOMM9B4E/a9PuAK4AJ4OfAHwFU1ckkHwXub9N9pKpOtsvvBT4HvIzOUUQeSSRJi2jGMKiqh4ALetQvmWL6AnZMcdsuYFeP+jjwxpl6kSQtDL+BLC0T/lqppmMYSJIMA0mD5z+lHzzDQJJkGEiSDANJEoaBJAnDQJKEYSBJwjCQJGEYSJIwDCTNM788NpoMA0mSYSBJMgwkSRgGkiQMA0kShoEkCcNAkkQfYZDkN5Pcl+TBJIeS/Hmrn5fk3iSHk3wxyVmtfna7PtFuX991Xx9s9ceSXNZV39JqE0l2zv/DlCRNp58tg+eAS6rq94CNwJYkm4GPA5+qqg3A08C1bfprgaer6u8Bn2rTkeR84CrgDcAW4DNJViRZAXwauBw4H7i6TStJWiQzhkF1/KxdPbOdCrgE+FKr7waubJe3tuu029+WJK2+p6qeq6ofABPARe00UVVHquoXwJ42rSRpkfS1z6B9gj8IHAcOAN8HflJVz7dJjgJr2uU1wBMA7fZngFd110+ZZ6p6rz62JxlPMn7ixIl+WpekoTYsP9/RVxhU1QtVtRFYS+eT/Ot7TdbOM8Vts6336uOWqtpUVZvGxsZmblyS1JdZHU1UVT8B/gbYDKxMcka7aS3wZLt8FFgH0G7/beBkd/2UeaaqS5IWST9HE40lWdkuvwx4O/AocDfw7jbZNuCOdnlvu067/a+rqlr9qna00XnABuA+4H5gQzs66Sw6O5n3zseDkyT154yZJ2E1sLsd9fMbwO1V9VdJvgvsSfIx4DvArW36W4G/TDJBZ4vgKoCqOpTkduC7wPPAjqp6ASDJdcB+YAWwq6oO
zdsjlCTNaMYwqKqHgAt61I/Q2X9wav3/An84xX3dANzQo74P2NdHv5KkBeA3kKURMSxHnWhpMgwkSYaBJMkwkCRhGEiSMAwkSRgGkiQMA0kShoEkCcNAkoRhIEnCMJAkYRhIkjAMJEkYBpIkDANJEoaBJAnDQJJEH2GQZF2Su5M8muRQkve1+oeT/DDJwXa6omueDyaZSPJYksu66ltabSLJzq76eUnuTXI4yReTnDXfD1SSNLV+tgyeB/6sql4PbAZ2JDm/3fapqtrYTvsA2m1XAW8AtgCfSbIiyQrg08DlwPnA1V338/F2XxuAp4Fr5+nxSZL6MGMYVNWxqvp2u/ws8CiwZppZtgJ7quq5qvoBMAFc1E4TVXWkqn4B7AG2JglwCfClNv9u4Mq5PiBJ0uzNap9BkvXABcC9rXRdkoeS7EqyqtXWAE90zXa01aaqvwr4SVU9f0pdkrRI+g6DJC8Hvgy8v6p+CtwMvA7YCBwDPjE5aY/Zaw71Xj1sTzKeZPzEiRP9ti5JmkFfYZDkTDpB8Pmq+gpAVT1VVS9U1S+Bz9IZBoLOJ/t1XbOvBZ6cpv4jYGWSM06pv0RV3VJVm6pq09jYWD+tS5L60M/RRAFuBR6tqk921Vd3TfYu4JF2eS9wVZKzk5wHbADuA+4HNrQjh86is5N5b1UVcDfw7jb/NuCO03tYkqTZOGPmSXgz8B7g4SQHW+3f0zkaaCOdIZ3HgT8GqKpDSW4HvkvnSKQdVfUCQJLrgP3ACmBXVR1q9/cBYE+SjwHfoRM+kqRFMmMYVNU36T2uv2+aeW4AbuhR39drvqo6wq+HmSRJi8xvIEuSDANJkmEgScIwkCRhGEha4tbvvHPQLYwEw0CSZBhIkgwDSRKGgSQJw0CShGEgScIwkCRhGEiSMAwkSRgGkiT6++c2koZE908rPH7jOwfYiZYatwwkSYaBhp8/NCYtPMNAkjRzGCRZl+TuJI8mOZTkfa1+TpIDSQ6381WtniQ3JZlI8lCSC7vua1ub/nCSbV31NyV5uM1zU5Je/3NZ0oC4dbb09bNl8DzwZ1X1emAzsCPJ+cBO4K6q2gDc1a4DXA5saKftwM3QCQ/geuBi4CLg+skAadNs75pvy+k/NEl6sfU77zTYpjBjGFTVsar6drv8LPAosAbYCuxuk+0GrmyXtwK3Vcc9wMokq4HLgANVdbKqngYOAFvaba+sqm9VVQG3dd2XJGkRzGqfQZL1wAXAvcBrquoYdAIDeHWbbA3wRNdsR1ttuvrRHvVef397kvEk4ydOnJhN65KkafQdBkleDnwZeH9V/XS6SXvUag71lxarbqmqTVW1aWxsbKaWJUl96isMkpxJJwg+X1VfaeWn2hAP7fx4qx8F1nXNvhZ4cob62h51SSPOMfrR0c/RRAFuBR6tqk923bQXmDwiaBtwR1f9mnZU0WbgmTaMtB+4NMmqtuP4UmB/u+3ZJJvb37qm674kSYugny2DNwPvAS5JcrCdrgBuBN6R5DDwjnYdYB9wBJgAPgv8KUBVnQQ+CtzfTh9pNYD3Av+5zfN94Gvz8NgkzSM/5S+8QS7fGX+bqKq+Se9xfYC39Zi+gB1T3NcuYFeP+jjwxpl6kSQtDL+BLEkyDCRJhoEkCcNAkoRhIEnCMJA0Qjy0deEYBpIkw0CSZBhI0oyWw/CUYSBJMgyGyXL49CFpOBkGkiTDQJJkGEgD5/CghoFhIEkyDNQfP71qlPh6nT3DQJJkGEiamp+wlw/DQJI0cxgk2ZXkeJJHumofTvLDJAfb6Yqu2z6YZCLJY0ku66pvabWJJDu76ucluTfJ4SRfTHLWfD7AUeM/HZc0CP1sGXwO2NKj/qmq2thO+wCSnA9cBbyhzfOZJCuSrAA+DVwOnA9c3aYF+Hi7rw3A08C1p/OAJEmzN2MYVNU3gJN93t9WYE9VPVdVPwAmgIvaaaKqjlTVL4A9wNYkAS4BvtTm3w1cOcvHIEnzajluoZ/OPoPrkjzU
hpFWtdoa4ImuaY622lT1VwE/qarnT6n3lGR7kvEk4ydOnDiN1iVJ3eYaBjcDrwM2AseAT7R6ekxbc6j3VFW3VNWmqto0NjY2u44lSVOaUxhU1VNV9UJV/RL4LJ1hIOh8sl/XNela4Mlp6j8CViY545S6FtBy2/zVwvB1tLTMKQySrO66+i5g8kijvcBVSc5Och6wAbgPuB/Y0I4cOovOTua9VVXA3cC72/zbgDvm0pMkae7OmGmCJF8A3gqcm+QocD3w1iQb6QzpPA78MUBVHUpyO/Bd4HlgR1W90O7nOmA/sALYVVWH2p/4ALAnyceA7wC3ztujkyT1ZcYwqKqre5SnXGFX1Q3ADT3q+4B9PepH+PUw08ia3GR+/MZ3DrgTSZo9v4EsTcNxcfWyFA89NQwkLbiltuJcigwDaQgsxU+aGi2GgSTJMJCkqQzL1tpibDkaBpIkw0CSBmFYtjomGQbSEjZsKxwNL8NAkmQYSKPOT/+aD4aBNIQWegXv9xp0qhl/m0iSBs3gWniGgTSiXEGePpfhrzlMJEkyDDR4fjrTMFjur0PDQJJkGEiSDANJGrhhGKKaMQyS7EpyPMkjXbVzkhxIcridr2r1JLkpyUSSh5Jc2DXPtjb94STbuupvSvJwm+emJJnvBylJC2EYVuLzpZ8tg88BW06p7QTuqqoNwF3tOsDlwIZ22g7cDJ3wAK4HLqbz/46vnwyQNs32rvlO/VuSpAU2YxhU1TeAk6eUtwK72+XdwJVd9duq4x5gZZLVwGXAgao6WVVPAweALe22V1bVt6qqgNu67mskLaVPCoPg8hstfpN56ZjrPoPXVNUxgHb+6lZfAzzRNd3RVpuufrRHXdIImQyE7mAwJEbLfO9A7jXeX3Oo977zZHuS8STjJ06cmGOLS5dvPg07X6PDa65h8FQb4qGdH2/1o8C6runWAk/OUF/bo95TVd1SVZuqatPY2NgcW5cknWquYbAXmDwiaBtwR1f9mnZU0WbgmTaMtB+4NMmqtuP4UmB/u+3ZJJvbUUTXdN3XyHIcVRoM33tzN+MP1SX5AvBW4NwkR+kcFXQjcHuSa4H/Bfxhm3wfcAUwAfwc+COAqjqZ5KPA/W26j1TV5E7p99I5YullwNfaSZK0iGYMg6q6eoqb3tZj2gJ2THE/u4BdPerjwBtn6kO9rd95J4/f+M5Bt6Euk59MB/G8DPJva7T5DWQNhJvymi++luaHYSBp6LiCX3yGgaRZcUW9NBkGUp8GdaSKK9+XGtTzsJSfC8OgT0v1RbBUH5ek2TEMNK8MF2k0GQaStEj6+bA0qOEow0CSZmGpbv3O+KUzDb+l+uKUpuPrfn65ZSAtAldcGnaGgYDRXlkt9UP+ljqfu+FgGMzAFc1o8jmbPZdZx3wth1FbnoaBRtKovdGk2Vrs17hhIElDbLFCwTCQtOy4ZflShsES5b6O0eRzNrpG/T1nGEjSPBrVQDAMJL3IqK7MRrXvYWEYSAvMlZRGwWmFQZLHkzyc5GCS8VY7J8mBJIfb+apWT5KbkkwkeSjJhV33s61NfzjJttN7SJJGlcE5OPOxZfCPq2pjVW1q13cCd1XVBuCudh3gcmBDO20HboZOeADXAxcDFwHXTwbIsBr1HUXqz3J4jgf5GJfD8h0lCzFMtBXY3S7vBq7sqt9WHfcAK5OsBi4DDlTVyap6GjgAbFmAvkaKb5TR4vOlUXe6YVDA15M8kGR7q72mqo4BtPNXt/oa4ImueY+22lT1l0iyPcl4kvETJ06cZuvScDudLVDDSbN1umHw5qq6kM4Q0I4kb5lm2vSo1TT1lxarbqmqTVW1aWxsbPbdNr5RRkM/K8PTvV1Sx2mFQVU92c6PA1+lM+b/VBv+oZ0fb5MfBdZ1zb4WeHKauvQirtilhTPnMEjyW0leMXkZuBR4BNgLTB4RtA24o13eC1zTjiraDDzThpH2A5cmWdV2HF/aakveUtoRvVQex3K02K/D2f49X1uL43S2
DF4DfDPJg8B9wJ1V9d+BG4F3JDkMvKNdB9gHHAEmgM8CfwpQVSeBjwL3t9NHWk0jzjfx9Fw+i8vlPb05/9vLqjoC/F6P+o+Bt/WoF7BjivvaBeyaay+DNtWLbP3OO3n8xnf+6nyQJnscdB9afK4EF95SWMZ+A3kJWgovzKVqIZ8bn3edDsNA6oMrWi11hoEW1emuVBdyZ+dS2qGv4TDfr6mFfH0aBkvcdPszXPFJmmQY6Ff6DYjZhshih85i/T2/8KalxDBYZK4gBs/nQHqpOR9aquVhLivOpb6y7T5UeKk+1qX6uDQ1twy0ZLgCk+bOMGj6XZG4wpmbYdhhPei/r+VrFF57hsGQG4UX0SgahnCSholhIElyB/KofTpcLr8xNGrPi4aTr6P+LfswGBSP0lm+fB6Xt2F9/h0mWiSjNEY92WevfkflMUiaHbcMptHPN0znY7hmGH7ieq7mKxz8NU9psJb1loErCc03X1Pzx2W5uJZ1GJxqWIdyHK6RtNAMAy1rhqrU4T4DDYwrYml4DM2WQZItSR5LMpFk5yB7cSW1/HQ/5z7/Wo6GIgySrAA+DVwOnA9cneT8wXalmbjSlJaOoQgD4CJgoqqOVNUvgD3A1gH3JEnLRqpq0D2Q5N3Alqr61+36e4CLq+q6U6bbDmxvV38XeOw0/uy5wI9OY/7FMOw9Dnt/YI/zxR7nxzD0+DtVNXZqcVh2IKdH7SUpVVW3ALfMyx9Mxqtq03zc10IZ9h6HvT+wx/lij/NjmHsclmGio8C6rutrgScH1IskLTvDEgb3AxuSnJfkLOAqYO+Ae5KkZWMohomq6vkk1wH7gRXArqo6tMB/dl6GmxbYsPc47P2BPc4Xe5wfQ9vjUOxAliQN1rAME0mSBsgwkCQtvzAYpp+9mJRkXZK7kzya5FCS97X6h5P8MMnBdrpiwH0+nuTh1st4q52T5ECSw+181QD7+92uZXUwyU+TvH/QyzHJriTHkzzSVeu53NJxU3t9PpTkwgH2+J+SfK/18dUkK1t9fZL/07U8/2KAPU753Cb5YFuOjyW5bIA9frGrv8eTHGz1gSzHKVXVsjnR2Tn9feC1wFnAg8D5Q9DXauDCdvkVwP+k87McHwb+7aD76+rzceDcU2r/EdjZLu8EPj7oPrue6/8N/M6glyPwFuBC4JGZlhtwBfA1Ot+92QzcO8AeLwXOaJc/3tXj+u7pBrwcez637f3zIHA2cF57368YRI+n3P4J4D8McjlOdVpuWwZD+bMXVXWsqr7dLj8LPAqsGWxXfdsK7G6XdwNXDrCXbm8Dvl9VfzvoRqrqG8DJU8pTLbetwG3VcQ+wMsnqQfRYVV+vqufb1XvofP9nYKZYjlPZCuypqueq6gfABJ33/4KarsckAf4F8IWF7mMullsYrAGe6Lp+lCFb6SZZD1wA3NtK17XN9F2DHIJpCvh6kgfaT4MAvKaqjkEn1IBXD6y7F7uKF7/phmk5wtTLbVhfo/+KzhbLpPOSfCfJ/0jy+4Nqqun13A7jcvx94KmqOtxVG5rluNzCoK+fvRiUJC8Hvgy8v6p+CtwMvA7YCByjs4k5SG+uqgvp/LrsjiRvGXA/PbUvLv4B8N9aadiW43SG7jWa5EPA88DnW+kY8Her6gLg3wD/NckrB9TeVM/t0C1H4Gpe/AFlmJbjsguDof3ZiyRn0gmCz1fVVwCq6qmqeqGqfgl8lkXYzJ1OVT3Zzo8DX239PDU5jNHOjw+uw1+5HPh2VT0Fw7ccm6mW21C9RpNsA/4J8C+rDXS3oZcft8sP0BmP//uD6G+a53bYluMZwD8DvjhZG6blCMsvDIbyZy/aWOKtwKNV9cmuevdY8buAR06dd7Ek+a0kr5i8TGfn4iN0lt+2Ntk24I7BdPgiL/oENkzLsctUy20vcE07qmgz8MzkcNJiS7IF+ADwB1X18676WDr/g4QkrwU2AEcG1ONUz+1e4KokZyc5j06P9y12f13eDnyvqo5OFoZp
OQLL62ii9sHmCjpH63wf+NCg+2k9/SM6m7APAQfb6QrgL4GHW30vsHqAPb6WztEZDwKHJpcd8CrgLuBwOz9nwMvy7wA/Bn67qzbQ5UgnmI4B/4/OJ9Zrp1pudIY3Pt1enw8DmwbY4wSdcffJ1+RftGn/eXsNPAh8G/inA+xxyucW+FBbjo8Blw+qx1b/HPAnp0w7kOU41cmfo5AkLbthIklSD4aBJMkwkCQZBpIkDANJEoaBJAnDQJIE/H+Kb2CKlHwH4gAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
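    ],
   "source": [
    "# Bar chart of feature importance by column index; per the table above, indices 0-92 are the original features and 93-185 the tf-idf features\n",
    "plt.bar(range(len(lg.feature_importances_)), lg.feature_importances_)\n",
    "plt.show()"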
   ]
  },
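  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The tf-idf features tend to have higher importance: the ten highest-ranked features are all tf-idf features, while the lowest-ranked ones are mostly original count features."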
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "metadata": {
     "collapsed": false
    },
    "source": []
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
